ABSTRACT

A finite  probabilistic  system  (FPS)  is  a stationary  discrete-time 
controlled  stochastic  dynamical  process,  having  finite  input,  output, 
and (internal) state sets. The partially-observable Markov decision process
is an example of such a system. FPS formulations provide a convenient
framework  for  the  study  of  problems  of  state  estimation,  statistical 
decision,  or  control,  where  state  information  is  available  only  through 
a finite  memoryless  channel,  and  observation  dynamics  may  depend  on  the 
inputs  selected. 

Notions  of  reachability  and  detectability  in  FPS's  (similar  to 
controllability  and  observability  in  linear  systems)  are  made  precise. 

It  is  shown  that  every  FPS  can  be  reduced  to  components  that  are  either 
reachable  and  detectable,  or  transient,  or  null-recurrent. 

It is well known that the information vector (whose i-th entry is
the a posteriori probability that the system is in state i) is a sufficient
statistic (for the estimation of future dynamics given past inputs
and outputs). A contraction property of the information vector transition
function is exploited to obtain procedures for ε-optimal (arbitrarily
close) approximation of the information vector by a deterministic
time-invariant finite-memory observer. Each observer state corresponds
to a particular configuration of most recent input-output pairs. The
average error achieved by such an approximation is bounded by the expression
(m/m_0) T, where m_0 and T are parameters associated with the
observed system, and m is the number of observer states.


Control problems, in which the average reward is maximized over
a discounted  or  undiscounted  infinite  horizon,  may  be  solved  by  an 
iterative  procedure  which  has  been  given  the  name  perceptive  dynamic 
programming.  Successively  weaker  assumptions  that  the  controller 
"perceives"  unavailable  state  values  transform  the  problem  into  a 
sequence  of  formulations  which  may  be  solved  by  dynamic  programming. 
Each  solution  obtained  in  this  manner  is  used  to  construct  a feasible 
controller  formulation,  taking  the  form  of  a deterministic  time- 
invariant  finite-state  automaton.  Monotone  geometrically  convergent 
bounds,  containing  both  the  supremum  feasible  performance  and  that  of 
the  current  design,  are  also  obtained.  Computation  may  be  terminated 
when  these  bounds  become  sufficiently  close,  or  when  the  number  of 
controller  states  becomes  excessively  large.  Although  computing  a 
solution  by  perceptive  dynamic  programming  may  require  considerable 
time  and  storage,  both  are  roughly  proportional  to  the  number  of 
controller  states  allowed  in  the  final  iteration;  thus  the  cost  of 
controller design reflects the cost of controller implementation.

This procedure was applied to idealized problems of machine maintenance
and computer communication, both of which had been investigated
by other researchers. The first problem was solved exactly; a design
suitably close to the optimum was obtained for the second problem.


NAME  AND  TITLE  OF  THESIS  CO-SUPERVISORS: 

Alvin  Drake 
NOTATIONS

If A and B are sets, then $A-B$ is the set of elements in A that are
not contained in B. $|A|$ is the number of elements in A. $B^A$ is the set
of mappings from A to B. $2^A$ is the set of subsets of A. $\emptyset$ is the null
set.

$\langle a,b\rangle$ is the set of integers i satisfying $a \le i \le b$. The sequence
$\{A_a, A_{a+1}, \ldots, A_{b-1}, A_b\}$ is denoted $\{A_k\}_{k \in \langle a,b\rangle}$. $a \div b$ denotes integer
quotient rounded down, i.e. the integer q of largest magnitude such that
$|bq| \le |a|$ and $\mathrm{sgn}(bq) = \mathrm{sgn}(a)$. $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$ is the binomial
coefficient for n items taken k at a time.

$[a,b]$ is the set of real numbers x satisfying $a \le x \le b$; similarly $[a,b)
= [a,b]-\{b\}$. $(a)^+ = \max(a,0)$ and $(a)^- = \min(a,0)$; clearly $a = (a)^+ + (a)^-$.

$R^N$ denotes the Euclidean space of column N-vectors. A row vector
$\pi$ is substochastic if its entries are all nonnegative and sum to a
quantity not exceeding unity; it is stochastic if it is substochastic
and the sum of its entries is exactly one. $\Pi_N$ and [symbol illegible] denote the sets
of stochastic and substochastic row N-vectors, respectively. A square
matrix is stochastic (substochastic) if each of its rows is a stochastic
(substochastic) vector. $v_i$ denotes the i-th entry of vector v; similarly,
$P_{ij}$ is the ij-th entry of matrix P, and $\mathrm{row}_i[P]$ is the row vector whose
j-th entry is $P_{ij}$. The superscript "T" denotes transpose. $e_i$ is the
"unit" vector whose i-th entry is unity and whose remaining entries equal
zero; 0 is a vector of zeroes and 1 is a vector whose every entry equals
unity; the dimension and inclination (row or column) of $e_i$, 0, and 1
are determined by context. The usual rules of matrix algebra will be
observed; thus if $\pi \in \Pi_N$ and $q \in R^N$, then the quantity $\pi q$ is a scalar.

If $x \in R^N$, then $|x|$ = [expression in the entries $|x_j|$ illegible]. If $x, y \in R^N$, then $x \le y$ is understood
to imply $x_i \le y_i$, $\forall i \in \langle 1,N\rangle$, and $x < y$ implies $x_i < y_i$, $\forall i \in \langle 1,N\rangle$.
CHAPTER I
PRELIMINARIES 


1.  Introduction 




This  dissertation  introduces  concepts  and  associated  computational 
procedures  that  are  applicable  to  a mathematical  problem  arising  in  the 
context  of  Operations  Research  and  Stochastic  Control.  Briefly  stated, 
the  problem  is  to  design  a strategy  for  real-time  decision-making  on  the 
basis of imperfect (state) information and finite memory. The plant
(i.e. the object to be controlled) is modelled as a finite probabilistic
system (FPS), or stationary discrete-time finite-input finite-output
finite-state controlled stochastic process, a generalization of the
partially-observed Markov decision model initiated by Drake (1962), which
itself generalizes the Markov decision model of Bellman (1957a).

An  engineering  problem  which  might  be  tackled  by  the  methods  espoused 
in  this  dissertation  is  the  following: 


(1.1) Machine Maintenance and Repair Problem (Scenario). A factory contains
a large number of identical machines, each of which may require
overhaul from time to time. A repairman maintains a "status report" for
each machine and effects the overhauls. Unfortunately, a lengthy inspection
procedure must be performed in order to determine whether or
not a particular machine is actually in need of an overhaul. Thus it
is clearly impractical and undesirable to inspect every machine daily.
For example, if a certain machine is believed likely to require overhaul,
it might be advisable to overhaul that machine without inspecting it at
all. The problem is to determine a simple rule for the repairman to
follow  in  making  decisions  for  individual  machines,  and  in  recording 
each  machine's  status.  A solution  to  this  problem  may  be  visualized  as 
a manual  in  which  every  possible  machine  status  is  listed,  along  with 
a course  of  action  and  a new  status  resulting  from  that  action.  The 
status  code  must  be  reasonably  concise,  for  otherwise  the  manual  will 
assume  mammoth  proportions.  Given  the  relative  undesirability  of  broken 
machines and repair costs, as well as a set of admissible actions, the
problem may be expressed as that of determining the optimal* (most
desirable) strategy for coding machine status and repairing machines,
as realized by the policy specified in the repairman's manual.

*The optimum may not exist; ε-optimal strategies are then sought.

Generalizations: A similar scenario might involve a crowded hospital
in which patients are visited by a doctor who must decide, on the basis
of previous visits, how to allocate his time. The controller might also
be a computer. Possible applications include: routing "packets" through
a telecommunications network, controlling traffic signals at a busy intersection
or along a congested freeway, and scheduling shipments from a warehouse
serving several retail outlets.

Engineering problems of this type necessarily require that a tradeoff
be made between accuracy of the model in depicting the "real" problem
and solvability of the problem described by the model. The FPS model is
more general than a Markov decision model; it is also more difficult
to  solve.  The  Markov  decision  model  assumes  that  perfect  state 
information is available to the decision-maker. In the Machine Maintenance
and Repair Problem, this means that, in order to use a Markov
decision  model,  it  would  be  necessary  to  assume  that  the  repairman  knows 
at  all  times  whether  or  not  a particular  machine  is  operating  properly; 
his  course  of  action  is  then  obvious.  The  applications  envisioned  for 
an  FPS  decision  theory  are  those  in  which  the  decision  to  seek  information 
is  crucial,  and  for  which  the  Markov  decision  model  is,  consequently, 
inadequate. 

More  specifically,  two  possible  aspects  of  "real"  control  problems 
are  captured  by  the  FPS  formulation,  but  totally  ignored  in  Markov 
decision  theory.  One  aspect  is  the  "dual  control"  phenomenon,  where  the 
decision-maker  must  decide  whether  to  seek  better  state  information  at 
the  expense  of  short-term  performance,  or  to  seek  improved  immediate 
performance  at  the  expense  of  information  forgone  in  the  interim.  The 
other  aspect  is  the  "saturation"  phenomenon,  in  which  the  decision-maker 
is  confronted  with  more  information  than  may  be  considered  in  the  time 
allotted  for  decision-making.  Conventional  linear-quadratic-Gaussian 
control  methods,  likewise,  avoid  "dual  control"  and  "saturation" 
phenomena  by  requiring  that  observation  dynamics  be  unaffected  by  the 
input  process. 

In  problems  such  as  the  Machine  Maintenance  and  Repair  Problem, 
where  information  is  available  only  at  a cost,  perfect  state  information 
cannot be taken for granted, and separation of input and output dynamics
does not occur. At the heart of the problem is the determination of what
information  is  important  for  purposes  of  decision-making,  and  what 
information  may  be  disregarded.  An  important  contribution  of  this 
research  is  a bound  on  the  value  of  information.  When  the  cost  of 
obtaining information exceeds its value, then it is advisable to do without
that information.

The elimination of "dual control" immediately leads to a "saturation"
condition, since the decision whether to seek further information must
be based on all information acquired thus far. Fortunately, the value
of information decreases geometrically with delay, in most FPS's. Thus,
for any ε>0, there is an integer Z such that the value of all information
delayed by Z or more time units is less than ε. This
implies that there exists an ε-optimal strategy (a strategy whose performance
lies within ε of the supremum feasible performance) for decision-making
based on the most recent Z inputs and outputs alone. A
computational  method  for  strategy  optimization,  based  on  this  result, 
has  been  given  the  name  perceptive  dynamic  programming. 

As the number of most recent input-output pairs retained by the
decision-maker increases, the loss in performance from discarded information
decays geometrically and the number of memory states (called
"status codes" in (1.1)) increases geometrically. Thus, the performance
achieved by a decision-maker acting on the basis of m memory states can
be made to lie within (m/m_0) T of the supremum feasible performance,
where m_0 is the number of values in a sufficient incremental statistic,
and T is a parameter associated with the observed system.

The remainder of this report is devoted to making precise the
concepts  outlined  above.  The  FPS  model  is  described  in  detail  in  the 
following  section.  The  Machine  Maintenance  and  Repair  Problem  is 
formulated  as  an  FPS  control  problem  and  solved  in  Section  3.  A review 
of  related  work,  a compendium  of  original  contributions,  and  an  outline 
of  the  report  complete  this  chapter. 
2. The Model


a.  Representation  of  the  Plant. 

The  plant  will  be  modeled  as  an  FPS,  which  is  defined  by  (2.1), 
below.  Conceptually,  an  FPS  is  a generalization  of  a Markov  chain, 
shown  in  Figure  2-1.  A Markov  chain  has  the  property  that,  for  any 
time $k \in \langle 1,\infty\rangle$, the random variables $\{s(k')\}_{k' \in \langle 0,k-1\rangle}$ and
$\{s(k')\}_{k' \in \langle k+1,\infty\rangle}$ are conditionally independent given s(k). Thus the
transition probability that s(k+1) will assume value j given the values
of all past states $\{s(k')\}_{k' \in \langle 0,k\rangle}$ can be expressed as a function of
the value of s(k) alone. The broken arrow leading from s(k) to s(k+1),
in Figure 2-1, is intended to convey a sense that s(k+1) evolves probabilistically
from s(k) alone.


[Diagram: ... → s(k-1) ⇝ s(k) ⇝ s(k+1) → ...]

Figure 2-1. A Markov Chain


In a Markov decision process, shown in Figure 2-2, the transition
probabilities depend on inputs that are provided to the system by a
decision-maker. Input u(k) determines the manner in which s(k+1) evolves
probabilistically from s(k). If inputs are selected on the basis of the
most recent state alone, then the system becomes a Markov chain.
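
As an illustration of the last remark, the following sketch forms the Markov chain that results when the input is a fixed function of the most recent state. The names `A`, `policy`, and `induced_chain`, the numerical values, and the numbering of states from 0 rather than 1 are all assumptions made for the example; none of them comes from the text. Row i of the chain is simply row i of the transition matrix selected by the input chosen in state i.

```python
# Illustrative sketch only: names and numbers are assumptions, not the author's notation.

def induced_chain(A, policy):
    """Return the Markov-chain transition matrix obtained when the input
    applied in state i is always policy[i].

    A[u] is the row-stochastic transition matrix under input u."""
    n = len(policy)
    return [list(A[policy[i]][i]) for i in range(n)]


# Two states, two inputs (made-up numbers).
A = {
    "wait":   [[0.9, 0.1], [0.0, 1.0]],
    "repair": [[1.0, 0.0], [1.0, 0.0]],
}
policy = ["wait", "repair"]   # state 0 -> wait, state 1 -> repair
P = induced_chain(A, policy)
print(P)                      # each row still sums to one
```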

[Diagram: ... → s(k-1) ⇝ s(k) ⇝ s(k+1) → ..., with input u(k) entering the transition from s(k) to s(k+1)]

Figure 2-2. A Markov Decision Process

A partially-observable  Markov  decision  process,  shown  in  Figure  2-3, 
combines a Markov decision process with a process of noisy outputs. Output
y(k) depends probabilistically on s(k) alone. It is easy to see that
a partially-observable  Markov  decision  process  is  entirely  equivalent  to 
a Markov  decision  process  whose  state  at  time  k consists  of  the  pair 
[s(k),y(k)];  y(k)  then  becomes  a perfect  observation  of  the  second  state 
component,  and  is  referred  to  as  an  "incomplete"  state  observation. 


[Diagram: the chain of Figure 2-2, with output y(k) emitted from state s(k)]

Figure 2-3. A Partially-observable Markov Decision Process

[Diagram: the chain of Figure 2-2, with output y(k) emitted from the transition between s(k-1) and s(k)]

Figure 2-4. A Finite Probabilistic System

A finite probabilistic system is shown in Figure 2-4. Output y(k)
now depends probabilistically on s(k-1), u(k-1), and s(k), and may be
thought of as a noisy measurement of the most recent state transition.
Yet, an FPS is always equivalent to a Markov decision process whose state
at time k consists of the pair [s(k),y(k)]. Thus, every partially-
observable Markov decision process is an FPS, and any FPS may be transformed
into a partially-observable Markov decision process. The distinction
between the two lies in their representations, i.e. in the
notation used to describe them.
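
The equivalence just described can be sketched in a few lines. The code below assumes (these conventions are not from the text) that the matrices P(y|u) are stored in a dictionary keyed by `(y, u)`, that states are numbered from 0, and that the function name `augment` is free to choose. It builds the transition matrices of a Markov decision process on the pairs (state, output); the output is then read directly from the second component of the augmented state.

```python
# Illustrative sketch only: data layout and names are assumptions.

def augment(P, n_states, inputs, outputs):
    """Return (pairs, A). pairs lists the augmented states (j, y);
    A[u][a][b] is the probability of moving from pair a to pair b under u.
    The entry depends on the old pair only through its state component."""
    pairs = [(j, y) for j in range(n_states) for y in outputs]
    A = {}
    for u in inputs:
        A[u] = [[P[(y, u)][i][j] for (j, y) in pairs]
                for (i, _previous_output) in pairs]
    return pairs, A


# A two-state, two-output, single-input FPS with made-up numbers.
P = {
    (0, "a"): [[0.5, 0.0], [0.2, 0.1]],
    (1, "a"): [[0.25, 0.25], [0.3, 0.4]],
}
pairs, A = augment(P, n_states=2, inputs=["a"], outputs=[0, 1])
for row in A["a"]:
    print(row)   # every row sums to one, up to rounding
```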

Since s(k) depends probabilistically on s(k-1) and u(k-1), the pair
s(k) and y(k) may be viewed as random variables that depend jointly on
s(k-1) and u(k-1). In this form, the dynamic evolution of an FPS is
entirely described by an array of probabilities for the state and output,
conditioned on the previous state and input. Except for the requirements
that the input, output, and internal state sets be finite, and that
dynamics be stationary, an FPS is totally unstructured.

The formal definition of an FPS can now be given.

(2.1) Definition. A finite probabilistic (dynamical) system (FPS) is
a 5-tuple $(U, Y, S, \pi(0), \{P(y|u) : y \in Y, u \in U\})$ where:

(i) U is a finite nonempty set of input values (or decisions);

(ii) Y is a finite nonempty set of output values (or observations);

(iii) $S = \langle 1,N\rangle$ is a finite nonempty set of (internal) state values;

(iv) $\pi(0)$ is a stochastic N-vector of initial state probabilities;

(v) Each $P(y|u)$ is an N×N substochastic matrix of state transition
probabilities, and $\sum_{y \in Y} P(y|u)$ is stochastic, $\forall u \in U$.


The dynamic evolution of an FPS is described in the following terminology:

1. The initial state s(0) assumes value i with probability $\pi_i(0)$.

2. When a decision-maker specifies input u(k), that input is said
to be accepted by the FPS. Output y(k+1) is subsequently emitted
by the FPS.

3. Given that an FPS in state s(k)=i accepts input u(k)=u, it
will undergo a transition to state s(k+1)=j and emit output
y(k+1)=y with probability $P_{ij}(y|u)$, conditionally independently of
the "past history" $\{s(k')\}_{k' \in \langle 0,k-1\rangle}$, $\{u(k')\}_{k' \in \langle 0,k-1\rangle}$.

4. The Markov decision process consisting of the internal state
and input processes of an FPS is called the underlying process
(of that FPS). It is described by the stochastic
matrices $\{\sum_{y \in Y} P(y|u) : u \in U\}$.

5. The time set is $\langle 0,K\rangle$. The terminal time K is called the
horizon.
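
Definition (2.1) and the evolution rules above translate directly into a small program. The sketch below is only one possible encoding: the dictionary keyed by `(y, u)`, the function names `is_valid_fps` and `step`, the zero-based state numbering, and the numerical values are assumptions made for illustration. The first function checks conditions (iv) and (v); the second draws one joint transition (next state, output) according to rule 3.

```python
# Illustrative sketch only: data layout, names and numbers are assumptions.
import random


def is_valid_fps(pi0, P, inputs, outputs, tol=1e-9):
    """Check that pi0 is stochastic and that, for every input u, the
    matrices P(y|u) are nonnegative and sum over y to a stochastic matrix."""
    n = len(pi0)
    if any(p < -tol for p in pi0) or abs(sum(pi0) - 1.0) > tol:
        return False
    for u in inputs:
        for i in range(n):
            row_total = 0.0
            for y in outputs:
                row = P[(y, u)][i]
                if any(p < -tol for p in row):
                    return False
                row_total += sum(row)
            if abs(row_total - 1.0) > tol:
                return False
    return True


def step(P, outputs, i, u, rng):
    """From state i under input u, draw (j, y) with probability P_ij(y|u)."""
    n = len(P[(outputs[0], u)])
    pairs = [(j, y) for y in outputs for j in range(n)]
    weights = [P[(y, u)][i][j] for (j, y) in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]


P = {
    (0, "a"): [[0.5, 0.0], [0.2, 0.1]],
    (1, "a"): [[0.25, 0.25], [0.3, 0.4]],
}
rng = random.Random(0)
print(is_valid_fps([1.0, 0.0], P, ["a"], [0, 1]))
print(step(P, [0, 1], 0, "a", rng))   # a (next state, output) pair
```

Sampling the pair (j, y) jointly, rather than the state first and the output second, mirrors the Mealy form used in the definition, where a single array gives the probability of both.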

b.  Alternate  Representations. 

The  expression  "finite  probabilistic  system"  is  used  in  accordance 
with  a classification  of  systems  by  Kalman,  Falb, and  Arbib  [1969].  The 
notation  used  to  specify  dynamics  for  a particular  FPS  is  that  of  Paz 
[1971]. It is also called the Mealy form of an FPS, in consideration of
its  similarity  to  the  Mealy  form  of  a deterministic  machine.  The  Moore 
form  is  an  alternate  representation  in  which  y(k)  is  expressed  as  a 
deterministic  function  of  s(k)  alone. 

Yet  another  representation  is  that  of  Drake  [1962].  Here  the 
transition  probabilities  of  the  underlying  process  are  provided,  along 
with  a matrix  of  conditional  output  probabilities,  given  internal  states. 

A transformation to Mealy form is readily effected, although some care
must be taken to ensure that inputs, outputs, and time changes are defined
to occur in the correct order, i.e. that y(k) is emitted before u(k) is
accepted.
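
A minimal sketch of such a transformation follows. It assumes that the underlying transition matrices are stored as `A[u]`, that the conditional output probabilities are stored as `B[j][y]`, and that the output emitted at time k+1 depends only on the state entered at time k+1; this last convention matches the statement that y(k) depends on s(k) alone, but the data layout and the name `drake_to_mealy` are illustrative choices, not the notation of Drake [1962].

```python
# Illustrative sketch only: data layout and names are assumptions.

def drake_to_mealy(A, B, outputs):
    """Return P with P[(y, u)][i][j] = A[u][i][j] * B[j][y], i.e. the
    probability of moving from i to j under u and then emitting y."""
    P = {}
    for u, Au in A.items():
        n = len(Au)
        for y in outputs:
            P[(y, u)] = [[Au[i][j] * B[j][y] for j in range(n)]
                         for i in range(n)]
    return P


A = {"a": [[0.9, 0.1], [0.0, 1.0]]}   # underlying process (made-up numbers)
B = [[0.8, 0.2], [0.3, 0.7]]          # B[j][y]: output y given state j
P = drake_to_mealy(A, B, outputs=[0, 1])
print(P[(0, "a")])
print(P[(1, "a")])
```

Summing the resulting matrices over y recovers `A[u]`, as required by condition (v) of Definition (2.1).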
c. Some Important Classes of FPS's

(2.2) Definition. An FPS is state-observable if each transition probability
matrix P(y|u) has at most one non-zero column.

Interpretation: In a state-observable FPS, the internal state may be
deduced from the most recent input-output pair alone.

Example: A Markov decision process is a state-observable FPS.

(2.3) Definition. An FPS is state-calculable if each row of each transition
probability matrix P(y|u) has at most one non-zero entry.

Interpretation: In a state-calculable FPS, knowledge of the previous
internal state, along with the intervening input-output pair, is sufficient
to determine the present state.

Example: Consider a queuing system, in which only the numbers of arriving
and departing "customers" (over each discrete time interval) are observed.
This system may be modeled as a state-calculable FPS.

(2.4) Definition. An FPS is free if its input set contains exactly one
element.

Remark: A free FPS may be viewed as a "partially-observable Markov chain"
(Drake [1962]) or "stochastic process of finite rank" (Paz [1971]).
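
Definitions (2.2) through (2.4) are structural conditions on the matrices P(y|u) and on the input set, so they can be tested mechanically. The following sketch assumes the matrices are held in a dictionary keyed by `(y, u)`; the function names and the example matrices are illustrative assumptions.

```python
# Illustrative sketch only: data layout, names and numbers are assumptions.

def is_state_observable(P):
    """Definition (2.2): every P(y|u) has at most one non-zero column."""
    for M in P.values():
        nonzero_cols = {j for row in M for j, p in enumerate(row) if p > 0.0}
        if len(nonzero_cols) > 1:
            return False
    return True


def is_state_calculable(P):
    """Definition (2.3): every row of every P(y|u) has at most one
    non-zero entry."""
    return all(sum(1 for p in row if p > 0.0) <= 1
               for M in P.values() for row in M)


def is_free(inputs):
    """Definition (2.4): the input set contains exactly one element."""
    return len(set(inputs)) == 1


# A Markov chain whose output is its new state: state-observable.
mdp_like = {(0, "a"): [[0.9, 0.0], [0.0, 0.0]],
            (1, "a"): [[0.0, 0.1], [0.0, 1.0]]}
# The output reveals only whether the state changed: state-calculable.
calc_only = {(0, "a"): [[0.5, 0.0], [0.0, 0.5]],
             (1, "a"): [[0.0, 0.5], [0.5, 0.0]]}

print(is_state_observable(mdp_like), is_state_calculable(mdp_like))
print(is_state_observable(calc_only), is_state_calculable(calc_only))
```

Note that a single non-zero column leaves at most one non-zero entry in each row, so every state-observable FPS is also state-calculable.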

d. Specification of the Input Process

A rule for the selection of inputs to an FPS will be called a
(decision) strategy. A strategy $\gamma$ is specified by a probability distribution
for u(k) conditioned on the past history [s(0), u(0), y(1), s(1),
..., s(k-1), u(k-1), y(k), s(k)]; however, this representation is cumbersome.
It is far more convenient to consider the input process to be
generated by a dynamical system called a controller, which is a controlled
Markov process having input and state sets to be determined, and output
process {u(k)}.

A particular description of a decision strategy as a dynamical
system is called a realization of that strategy. Naturally some realizations
are more concise than others. A decision strategy satisfies a
finite-memory constraint if it has an FPS realization with input process
{y(k-1)}. In this report, consideration will be limited almost exclusively
to decision strategies that can be realized by deterministic time-
invariant finite-state automata.

The interconnection of an FPS with decision strategy $\gamma$ causes the
former's input, state, and output processes to become stochastic processes;
the resulting system may or may not be an FPS, depending on the size of
its state set (which must include all information required to describe
future inputs). This system will be called the free system induced (on
the FPS) by strategy $\gamma$, or, more informally, the system under $\gamma$.

If $\gamma$ satisfies a finite-memory constraint, then the system under $\gamma$
may be represented as a free FPS whose state is a doublet consisting of
both the plant and controller states.
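
The doublet construction can be sketched for the case of a deterministic finite-state controller. The conventions used here are assumptions made for illustration: the controller in state c applies input `mu[c]`, and after the plant emits y it moves to state `delta[c][y]`; joint states (i, c) are flattened to a single index; and the function name `closed_loop` is arbitrary. The timing convention is one reasonable reading of the text and not necessarily the author's.

```python
# Illustrative sketch only: controller encoding, timing and names are assumptions.

def closed_loop(P, outputs, n_plant, mu, delta, n_ctrl):
    """Free FPS induced on a plant by a deterministic finite-state controller.

    mu[c]       : input applied while the controller is in state c
    delta[c][y] : next controller state after output y is observed
    Joint state (i, c) is stored at index i * n_ctrl + c.
    Returns F with F[y] the joint transition matrix for output y."""
    n = n_plant * n_ctrl
    F = {y: [[0.0] * n for _ in range(n)] for y in outputs}
    for i in range(n_plant):
        for c in range(n_ctrl):
            u = mu[c]
            for y in outputs:
                c_next = delta[c][y]
                for j in range(n_plant):
                    F[y][i * n_ctrl + c][j * n_ctrl + c_next] += P[(y, u)][i][j]
    return F


# Two-state plant with inputs "a" and "b" (made-up numbers).
P = {
    (0, "a"): [[0.5, 0.0], [0.2, 0.1]],
    (1, "a"): [[0.25, 0.25], [0.3, 0.4]],
    (0, "b"): [[1.0, 0.0], [1.0, 0.0]],
    (1, "b"): [[0.0, 0.0], [0.0, 0.0]],
}
mu = ["a", "b"]               # controller state 0 applies "a", state 1 applies "b"
delta = [[0, 1], [0, 0]]      # output 1 seen in state 0 moves the controller to 1
F = closed_loop(P, [0, 1], 2, mu, delta, 2)
print(F[1][0])                # transitions out of joint state (0, 0) with output 1
```

The result has a single (implicit) input, so it is a free FPS with N times the number of controller states as its state count.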

The output process of a free FPS is a stochastic process, since
the probability distribution of system variables (states and outputs)
is well-defined. Such is not the case if U contains more than one element:
y(1) then depends on u(0), which is not a random variable (since
no probabilistic rule describing it has been provided). The
interconnection of an FPS with a decision strategy $\gamma$ causes all
system variables to become random variables. A probability measure,
denoted $\mathrm{Prob}_\gamma$, which describes these variables, is specified by the
induction:

$$\mathrm{Prob}_\gamma\{s(0)=i\} = \pi_i(0),$$

$$
\begin{aligned}
&\mathrm{Prob}_\gamma\{s(k')=s_{k'},\ u(k')=u_{k'},\ y(k'+1)=y_{k'},\ \forall k' \in \langle 0,k-1\rangle \\
&\qquad \text{and } s(k)=i,\ u(k)=u,\ y(k+1)=y,\ s(k+1)=j\} \\
&\quad = \mathrm{Prob}_\gamma\{s(k')=s_{k'},\ u(k')=u_{k'},\ y(k'+1)=y_{k'},\ \forall k' \in \langle 0,k-1\rangle \text{ and } s(k)=i\} \\
&\qquad \cdot\ \mathrm{Prob}\{\text{strategy } \gamma \text{ causes } u(k)=u \text{ to be selected} \mid \\
&\qquad\qquad s(k')=s_{k'},\ u(k')=u_{k'},\ y(k'+1)=y_{k'},\ \forall k' \in \langle 0,k-1\rangle \text{ and } s(k)=i\} \\
&\qquad \cdot\ P_{ij}(y|u). \qquad (2.5)
\end{aligned}
$$

Informally, $\mathrm{Prob}_\gamma$ is called the probability under (strategy) $\gamma$.

(2.6) Definition. $E_\gamma\{\cdot\}$ denotes expectation with respect to probability
measure $\mathrm{Prob}_\gamma\{\cdot\}$, i.e. expectation given that inputs are selected
according to strategy $\gamma$.

Notation: Subscript $\gamma$ may be omitted in $\mathrm{Prob}_\gamma\{\cdot\}$ and $E_\gamma\{\cdot\}$ when
the probability or expectation is the same for all strategies.


e.  The  Information  Vector 

(2.7) Definition. The stochastic N-vector $\pi(k)$ having components

$$\pi_i(k) = \mathrm{Prob}\{s(k)=i \mid u(0) \ldots u(k-1);\ y(1) \ldots y(k)\}$$

will be called the information vector at time k.

It is well known that $\pi(k)$ is a sufficient statistic for the
estimation of future dynamics given past inputs and outputs; this is a
trivial result of the Markov property of the internal state. The
following result is similarly self-evident.


(2.8) Proposition. The information vector may be recursively computed
according to Bayes' Rule:

$$\pi(k+1) = T(\pi(k), u(k), y(k+1)),$$

where T is the information vector transition function

$$T(\pi,u,y) = \pi P(y|u) \,/\, (\pi P(y|u) 1).$$
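
The recursion of Proposition (2.8) amounts to one vector-matrix product followed by a normalization. The sketch below assumes that the information vector is a plain list and that the matrix P(y|u) for the observed pair is passed in directly; the function name `update` and the numbers are illustrative assumptions. The normalizing constant is the probability of observing y given the current information vector and the input, so the update is undefined for an output of probability zero.

```python
# Illustrative sketch only: names and numbers are assumptions.

def update(pi, P_yu):
    """T(pi, u, y) = pi P(y|u) / (pi P(y|u) 1), where P_yu is the matrix
    P(y|u) for the input applied and the output observed."""
    n = len(pi)
    unnormalized = [sum(pi[i] * P_yu[i][j] for i in range(n)) for j in range(n)]
    total = sum(unnormalized)   # probability of the observed output
    if total == 0.0:
        raise ValueError("observed output has zero probability")
    return [x / total for x in unnormalized]


pi = [0.5, 0.5]
P_yu = [[0.5, 0.0], [0.2, 0.1]]   # P(y|u) for one particular (y, u)
print(update(pi, P_yu))           # close to [0.875, 0.125]
```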


Because $\pi(k)$ is a sufficient statistic, desirable decision
strategies may be realized by a deterministic machine having state
process $\{\pi(k)\}$. Such a decision strategy would be completely
described by a policy on $\Pi_N$, i.e. a mapping from $\Pi_N$ to U specifying
the input to be applied when the information vector has a given
value. This traditional approach to controller realization leads
to horrendous computational difficulties which have yet to be resolved.
The main contributions of this research are approximation schemes for
$\pi(k)$, and associated realizations which avoid the use of $\Pi_N$ as an
observer or controller state set.

f.  Rewards  and  Performance  Indices 

It is convenient to place a mechanism for evaluation of decision
strategies within the conceptual confines of the system itself. To
this end, consider the process of incremental (immediate) rewards
{r(k)}, each of which is determined from system variables s(k), u(k),
y(k+1), s(k+1), on the basis of a given array
$\{r[i,u,y,j] : i,j \in S,\ u \in U,\ y \in Y\}$, according to the rule

$$r(k) = r[s(k), u(k), y(k+1), s(k+1)].$$


See the discussion of previous work in this field, in Section 4.
(2.9) Definition. A valued finite probabilistic system (VFPS) is an
FPS  along  with  an  incremental  reward  array,  as  described  above. 


(2.10) Definition. The performance index is a function of the
decision strategy, taking one of the following forms:

(a) Finite horizon:

$$g(\{b(k)\}_{k \in \langle 0,K\rangle}, \gamma) = E_\gamma\Big\{\sum_{k=0}^{K} b(k)\, r(k)\Big\}, \quad K \in \langle 0,\infty\rangle.$$

(b) Discounted infinite-horizon:

$$g(\beta,\gamma) = (1-\beta)\, E_\gamma\Big\{\sum_{k=0}^{\infty} \beta^k r(k)\Big\}, \quad 0 \le \beta < 1.$$

(c) Undiscounted infinite-horizon:

$$g(\gamma) = \liminf_{\beta \uparrow 1}\, [g(\beta,\gamma)].$$

Remark: The undiscounted performance index $g(\cdot)$ is generally equivalent
to the "time-averaged reward" $\liminf_{K \to \infty} E_\gamma\{\frac{1}{K} \sum_{k=0}^{K-1} r(k)\}$. For a
discussion of the conditions under which these indices may differ, see
Flynn [1974]. The definition given above is more convenient, especially
when relative values are considered, since these converge as $\beta \uparrow 1$.
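
For a free system, such as the system under a fixed finite-memory strategy, the discounted index of (2.10b) can be evaluated numerically by summing the series term by term. The sketch below makes several assumptions that are not in the text: the free system is summarized by a single stochastic matrix `P` and a vector `q` of expected one-step rewards per state, the series is truncated once the discount weight falls below a tolerance, and all names and numbers are illustrative.

```python
# Illustrative sketch only: names, numbers and the truncation rule are assumptions.

def discounted_index(pi0, P, q, beta, tol=1e-12):
    """Approximate (1 - beta) * sum_k beta^k * E{q(k)} for a Markov chain
    with initial distribution pi0, transition matrix P and expected
    one-step rewards q, truncating once beta^k <= tol."""
    n = len(pi0)
    dist, weight, total = list(pi0), 1.0, 0.0
    while weight > tol:
        total += weight * sum(dist[i] * q[i] for i in range(n))
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= beta
    return (1.0 - beta) * total


P = [[0.9, 0.1], [0.5, 0.5]]
q = [1.0, 0.0]                # reward of one per period spent in state 0
for beta in (0.0, 0.5, 0.9, 0.999):
    print(beta, discounted_index([1.0, 0.0], P, q, beta))
```

In this example the printed values decrease toward 5/6, the long-run fraction of time the chain spends in state 0, which illustrates the sense in which the discounted index approaches the time-averaged reward as the discount factor tends to one.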

The incremental reward process may be replaced by a process of
expected incremental rewards {q(k)} defined by

$$q(k) = q_{s(k)}(u(k)), \qquad (2.11)$$

where

$$q_i(u) = \sum_{j \in S} \sum_{y \in Y} P_{ij}(y|u)\, r[i,u,y,j] \qquad (2.12)$$

denotes the expected reward given that s(k) = i and u(k) = u.

Clearly the substitution of process {q(k)} for {r(k)} in (2.10)
leaves the value of a performance index, for a particular decision
strategy, unchanged.

Also define

$$Q_{\max} = \max_{i \in S} \max_{u \in U}\, [q_i(u)],$$

$$Q_{\min} = \min_{i \in S} \min_{u \in U}\, [q_i(u)],$$

$$Q = Q_{\max} - Q_{\min}. \qquad (2.13)$$
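
The expected incremental rewards of (2.12), and their extreme values, can be computed directly from the transition and reward arrays. In the sketch below the reward array is represented by a function `r(i, u, y, j)`, the matrices P(y|u) by a dictionary keyed by `(y, u)`, and states are numbered from 0; these choices, the function name, and the numbers are assumptions made for illustration.

```python
# Illustrative sketch only: data layout, names and numbers are assumptions.

def expected_rewards(P, r, n, inputs, outputs):
    """q[u][i] = sum over j and y of P_ij(y|u) * r(i, u, y, j), as in (2.12)."""
    return {u: [sum(P[(y, u)][i][j] * r(i, u, y, j)
                    for y in outputs for j in range(n))
                for i in range(n)]
            for u in inputs}


P = {
    (0, "a"): [[0.5, 0.0], [0.2, 0.1]],
    (1, "a"): [[0.25, 0.25], [0.3, 0.4]],
}


def r(i, u, y, j):
    """Hypothetical reward: one unit whenever output 1 is emitted."""
    return 1.0 if y == 1 else 0.0


q = expected_rewards(P, r, n=2, inputs=["a"], outputs=[0, 1])
values = [v for per_state in q.values() for v in per_state]
q_max, q_min = max(values), min(values)
print(q, q_max, q_min)
```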
g. Classification of Problems

The  problems  of  interest  fall  into  three  categories.  The  first 
of  these  is  given  the  name  estimation.  The  finite-memory  estimation 
problem  is  to  learn  as  much  as  possible  about  the  current  internal 
state,  subject  to  a finite-memory  constraint.  Note  that  in  the  absence 
of  this  constraint,  the  problem  would  be  trivially  solved  by  computing 
the information vector according to (2.8). This can in fact be accomplished
if the set of values assumed by the information vector is
finite, as occurs when the FPS is state-observable or when a finite
horizon is contemplated. In general, however, the information vector
cannot be exactly computed on the basis of finite memory; the greater
the memory allowance, the better the approximation will be. The
problem is more accurately described as that of constructing a sequence
of finite-memory observers (i.e. systems accepting plant outputs) that
generate successively better approximations of the information vector.

A suitable  tradeoff  between  memory  size  and  estimator  quality  can  be 
made  by  the  designer  after  this  sequence  has  been  computed,  up  to  a 
maximum  acceptable  memory  size. 

The second problem is given the name statistical decision. It concerns
a VFPS in which the transition probability matrices do not depend
on u. The problem is to maximize a performance index of the form
specified in (2.10). This problem may be solved by constructing a
finite-memory observer, and using the information vector approximation
as the basis for decision-making. A typical statistical decision
problem is to guess the value of the internal state, according to an
array of rewards (penalties) for correct (incorrect) decisions.

The third problem, that of control, is to determine a decision
strategy  which  optimizes  a performance  index,  necessarily  taking  into 
account  the  effect  of  current  decisions  on  future  plant  behavior  as 
well  as  future  estimation  accuracy.  The  Machine  Maintenance  and  Repair 
Problem  (1.1)  falls  into  this  category. 

Since  statistical  decision  is  a special  case  of  control,  these 
problems  are  collectively  referred  to  as  FPS  control  problems.  In 
such  problems,  as  in  estimation,  a finite-memory  optimum  may  not  exist. 
The  problem  is  then  to  construct  a sequence  of  controller  designs  in 
which  memory  requirements  increase  and  performance  improves,  approaching 
a supremum  feasible  value.  Note  that  the  problem  is  not  to  maximize 
performance  subject  to  a given  bound  on  memory  size:  such  a formulation 
may  lead  to  an  artificial  situation  where  the  performance  of  mixed 
(randomized)  strategies  exceeds  that  of  pure  (deterministic)  ones,  thus 
defeating  the  main  purpose  of  a memory  constraint,  which  is  to  limit 
controller  complexity. 
3. Illustration of the Solution Procedure

The  Machine  Maintenance  and  Repair  Problem,  first  described  in 
(1.1),  will  now  be  precisely  formulated  as  an  undiscounted  infinite- 
horizon FPS control problem, and solved by perceptive dynamic programming.
The solution is also documented (in somewhat greater detail)
in Section 23a.

a.  Problem  Formulation 

Consider a single machine which can produce a single item, the
product, during each production cycle. The machine contains two
identical components subject to failure, each of which must operate
on every product. Depending on the status of the machine, the product
may be defective or nondefective. There are four control alternatives
(inputs) available during each production cycle. One is to manufacture
an item. The second is to manufacture an item, and then to examine it,
so as to determine whether or not it is defective. In the third
alternative, the machine is dismantled and inspected (at a cost); any
component found to be defective is replaced. The fourth alternative
is to replace both components, whether or not they have failed.

Although the plant would appear to have four internal states
(each of two components is operational or has failed), the number of
states can be reduced to three if it is recognized that the order in
which components fail is unimportant. Thus the state set is taken
to be:


\[
S = \left\{ \begin{array}{l}
1 : \text{All components are operational} \\
2 : \text{One component has failed} \\
3 : \text{Both components have failed}
\end{array} \right\}
\]


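The four inputs are:

\[
U = \left\{ \begin{array}{l}
1 : \text{Manufacture} \\
2 : \text{Examine} \\
3 : \text{Inspect} \\
4 : \text{Replace}
\end{array} \right\}
\]

The three outputs are:

\[
Y = \left\{ \begin{array}{l}
1 : \text{No information} \\
2 : \text{Non-defective product observed} \\
3 : \text{Defective product observed}
\end{array} \right\}
\]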

Probabilistic rules governing the breakdown of machines have
been modeled as follows: Both components are initially operational.
There is a probability of 0.1 that an operational component will
fail during the manufacture of a product, independently of the
component's age and the condition of the other component. If a
component fails prior to or during the manufacture of a particular item,
it causes that item to be defective with probability 0.5. Thus the
initial probability vector is π(0) = (1, 0, 0), and the transition
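probability matrices P(y|u), whose (i, j) entry is the probability of
moving from state i to state j and observing output y under input u, are:

\[
P(1|1) = \begin{bmatrix} 0.81 & 0.18 & 0.01 \\ 0.00 & 0.90 & 0.10 \\ 0.00 & 0.00 & 1.00 \end{bmatrix},
\]

\[
P(2|2) = \begin{bmatrix} 0.81 & 0.09 & 0.0025 \\ 0.00 & 0.45 & 0.0250 \\ 0.00 & 0.00 & 0.2500 \end{bmatrix},
\qquad
P(3|2) = \begin{bmatrix} 0.00 & 0.09 & 0.0075 \\ 0.00 & 0.45 & 0.0750 \\ 0.00 & 0.00 & 0.7500 \end{bmatrix},
\]

\[
P(1|3) = P(1|4) = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}.
\]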

The value of an item produced is one unit if it is nondefective,
zero  units  otherwise.  The  cost  of  examination  is  0.25  units.  New 
components  cost  a unit  apiece,  with  an  additional  charge  of  0.5  units 
for  inspection.  Hence,  the  expected  incremental  reward  vectors  are: 


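\[
q(1) = \begin{bmatrix} 0.9025 \\ 0.4750 \\ 0.2500 \end{bmatrix}, \quad
q(2) = \begin{bmatrix} 0.6525 \\ 0.2250 \\ 0.0000 \end{bmatrix}, \quad
q(3) = \begin{bmatrix} -0.5 \\ -1.5 \\ -2.5 \end{bmatrix}, \quad
q(4) = \begin{bmatrix} -2 \\ -2 \\ -2 \end{bmatrix}.
\]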

The  performance  index  is  undiscounted  profit  over  an  infinite  horizon. 
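
The matrices and reward vectors of this formulation can be checked against the stated failure model. The following is a minimal sketch, not part of the original report: the names `FAIL`, `DEFECT`, `step` and `good` are illustrative, and the state index `f` counts failed components (so `f = 0, 1, 2` stands for states 1, 2, 3). It assumes that components fail independently and that each failed component independently spoils the item with probability 0.5.

```python
from math import comb

FAIL = 0.1      # probability an operational component fails in one cycle
DEFECT = 0.5    # probability a failed component makes the item defective
N_COMP = 2      # number of components
states = range(N_COMP + 1)   # f = number of failed components


def step(f, f_next):
    """Probability that f failed components become f_next during manufacture."""
    if f_next < f:
        return 0.0
    k = f_next - f                      # components failing in this cycle
    return comb(N_COMP - f, k) * FAIL**k * (1 - FAIL)**(N_COMP - f_next)


def good(f_next):
    """Probability the item is non-defective given f_next failed components."""
    return (1 - DEFECT)**f_next


# Joint transition/observation matrices for "manufacture" and "examine".
P_manufacture = [[step(f, h) for h in states] for f in states]
P_nondefective = [[step(f, h) * good(h) for h in states] for f in states]
P_defective = [[step(f, h) * (1 - good(h)) for h in states] for f in states]

# Expected incremental rewards for the four inputs.
q1 = [sum(row) for row in P_nondefective]   # value of the item produced
q2 = [r - 0.25 for r in q1]                 # same, less the examination cost
q3 = [-(0.5 + f) for f in states]           # inspection plus one unit per failure
q4 = [-2.0 for _ in states]                 # two new components

for name, vec in (("q(1)", q1), ("q(2)", q2), ("q(3)", q3), ("q(4)", q4)):
    print(name, [round(x, 4) for x in vec])
```

Under these assumptions the two examination matrices sum to the manufacture matrix, which is a convenient consistency check on the data.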

The  Markov  decision  model  for  machine  maintenance  was  introduced 
by  Drake  [1968].  The  numbers  used  here  were  originally  devised  by 
Smallwood  and  Sondik  [1973],  to  illustrate  a computational  algorithm 
that  solves  finite-horizon  FPS  control  problems. 


b.  Solution  Procedure 


A solution  to  this  problem  is  obtained  in  several  iterations.  In 
each  of  these,  a Markov  decision  problem  will  be  solved,  yielding  a 
controller  design,  as  well  as  bounds  that  contain  the  performance 
of  the  optimal  controller  and  that  of  the  design  most  recently 
obtained. In early iterations the bounds will be loose; but as
computations become more intricate, the bounds will become closer;
eventually  they  will  coincide. 

In the first iteration, assume that the controller knows the true
value of the internal state at all times. (The artificial assumption
that a controller has the ability to "see" internal states by means
other than computation based on system outputs, will be known as
perception.) A Markov decision problem that is readily solved (e.g. by
Howard's algorithm, described in Howard [1960]) results, yielding the
optimal policy, relative value vector, and optimal gain:


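\[
\text{policy} = \begin{bmatrix} 1 \\ 3 \\ 4 \end{bmatrix}, \quad
v = \begin{bmatrix} 2.515 \\ 0.500 \\ 0.000 \end{bmatrix}, \quad
g = .5147 .
\]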

This  will  be  called  a perceptive  solution.  Since  the  (perceptive) 
controller  which  achieved  the  gain  .5147  had  access  to  more  information 
than  will  be  available  in  reality,  it  follows  that  .5147  is  an  upper 
bound  on  feasible  performance. 
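
The perceptive problem of the first iteration is an ordinary three-state Markov decision problem, so it can be reproduced with a few lines of policy iteration. The sketch below is illustrative only: the function names are hypothetical, the state transition matrices are assumed to be the sums of the matrices P(y|u) over outputs y, and the relative value of state 3 is fixed at zero as a normalization.

```python
# States 0, 1, 2 stand for the report's states 1, 2, 3; inputs keep labels 1-4.
CHAIN = [[0.81, 0.18, 0.01], [0.0, 0.9, 0.1], [0.0, 0.0, 1.0]]
RESET = [[1.0, 0.0, 0.0]] * 3
P = {1: CHAIN, 2: CHAIN, 3: RESET, 4: RESET}
q = {1: [0.9025, 0.475, 0.25], 2: [0.6525, 0.225, 0.0],
     3: [-0.5, -1.5, -2.5], 4: [-2.0, -2.0, -2.0]}


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))   # partial pivoting
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def evaluate(policy):
    """Value determination: g + v_i = q_i + sum_j p_ij v_j, with v_3 = 0."""
    a = [[1.0] + [(i == j) - P[u][i][j] for j in range(2)]
         for i, u in enumerate(policy)]
    b = [q[u][i] for i, u in enumerate(policy)]
    g, v1, v2 = solve(a, b)
    return g, [v1, v2, 0.0]


def improve(policy, v):
    """Policy improvement; keep the current input unless another is strictly better."""
    def test(i, u):
        return q[u][i] + sum(p * x for p, x in zip(P[u][i], v))
    new = []
    for i, u_old in enumerate(policy):
        best = max(P, key=lambda u: test(i, u))
        new.append(best if test(i, best) > test(i, u_old) + 1e-9 else u_old)
    return tuple(new)


policy = (4, 4, 4)   # arbitrary starting policy: always replace
while True:
    g, v = evaluate(policy)
    improved = improve(policy, v)
    if improved == policy:
        break
    policy = improved

print(policy, round(g, 4), [round(x, 3) for x in v])
```

Starting from the "always replace" policy, every policy visited has a single recurrent class, so the value-determination equations are nonsingular at each step; a different starting policy could require a multichain treatment that this sketch does not include.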

The strategy obtained in this iteration is called a perceptive
strategy. It might also have been feasible if the optimal input had
been the same for all states; but such is not the case, and so it
cannot be applied in practice. However, a feasible controller
realization might make use of the optimal perceptive strategy in the
following way: a value for the current internal state is guessed and
the corresponding optimal input is applied. Since this is the first
iteration, the guess must be made on the basis of no real-time
information whatsoever. Suppose, for example, that the guess is
"state = 1" at all times. Then input 1 will be selected at all times;
both machine components will eventually fail; and a gain of 0.25
results.

On  the  basis  of  these  computations,  it  is  concluded  that: 

1)  The  optimum  feasible  performance  lies  between 
0.25  and  .5147; 

2)  There  is  a feasible  solution,  requiring  no  memory, 
which  achieves  a performance  of  0.25. 

In  the  second  iteration,  a new  internal  state  is  devised,  taking 
the  form: 


x(k) = [s(k-1), u(k-1), y(k)].

Clearly x(k) is the state of a controlled Markov chain, and a new
FPS representation may be devised in which inputs, outputs, and
rewards remain as before, but the internal state is x(k) at time
k (see Brookes and Leondes [1973]). This is called an augmentation of
the original FPS. Since there are only four functionally
distinguishable input-output pairs, these may be coded and given the
representation z(k), according to the following table:

u(k-1)   y(k)   z(k)

  1       1      1
  2       2      2
  2       3      3
  3       1      4
  4       1      4

Using the 12 states of the form x(k) = [s(k-1), z(k)], a new Markov
decision problem is solved to obtain a new perceptive solution.
However, the perception is "weaker" this time, and the optimal perceptive
gain decreases to .4945. The optimal perceptive strategy is again
infeasible, and a feasible solution will be constructed by guessing
the internal state delayed by one time unit, the guess being based on
knowledge of z(k). For example, the state guess might be s(k-1) = 1
when z(k) = 1, 2, 4, and s(k-1) = 3 when z(k) = 3. In this case input
1 will again be selected at all times, and the feasible gain is 0.25.

On  the  basis  of  these  computations,  it  is  concluded  that 

1)  The  optimum  feasible  performance  lies  between 
0.25  and  .4945; 

2)  There is a feasible solution, requiring 4 memory states,
which achieves a performance of 0.25.

In subsequent iterations, x(k) will take the form x(k) = [s(k-ℓ),
z(k)] where z(k) is the memory state, a string of ℓ most recent
z-coded input-output pairs. The rules by which a memory state may be
constructed are rather complex, so for the moment regard the memory
state during iteration n as the string of (n-1) most recent z-coded
input-output pairs:

\[
\mathbf{z}(k) = z(k+1-n)\, z(k+2-n)\, \ldots\, z(k-1)\, z(k)
\]

As computation proceeds, the bounds on feasible performance become
closer and closer. Intuitively, this occurs because, as the memory
state becomes longer, the augmented state component that is perceived
or guessed is an internal state with greater delay, whose influence on
the present information vector is weaker. In this particular problem,
the bounds eventually coincide. On the ninth iteration, only eight
memory states are "recurrent" under the optimal strategy, and for each
of these, the optimal input does not depend on the delayed state
component of the augmented state. The optimal inputs are in fact given
by the deterministic sequence:

{u(k)} = {1,1,1,1,1,1,1,3, 1,1,1,1,1,1,1,3, ...}

Eight memory states are required to realize this sequence, using a
finite-state automaton. The optimal gain is g* = .422.
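
The gain of a periodic strategy of this kind can be evaluated directly, because each inspection returns the machine to state 1 and the process then repeats. The sketch below is an illustration under that renewal argument, not the algorithm of Section 22; `cycle_gain` is a hypothetical helper that averages the expected rewards over one cycle of m manufacturing steps followed by one inspection.

```python
P1 = [[0.81, 0.18, 0.01], [0.0, 0.9, 0.1], [0.0, 0.0, 1.0]]  # state transitions, input 1
q_manufacture = [0.9025, 0.475, 0.25]                        # q(1)
q_inspect = [-0.5, -1.5, -2.5]                               # q(3)


def cycle_gain(m):
    """Average reward per step when manufacturing m times, then inspecting."""
    pi = [1.0, 0.0, 0.0]          # state distribution just after an inspection
    total = 0.0
    for _ in range(m):
        total += sum(p * r for p, r in zip(pi, q_manufacture))
        pi = [sum(pi[i] * P1[i][j] for i in range(3)) for j in range(3)]
    total += sum(p * r for p, r in zip(pi, q_inspect))
    return total / (m + 1)


gains = {m: cycle_gain(m) for m in range(1, 31)}
best = max(gains, key=gains.get)
print(best, round(gains[best], 4))
```

Among the cycle lengths tried here, seven manufacturing steps per inspection give the largest average reward, in agreement with the sequence and the gain quoted above. The sketch only compares strategies of this one periodic form; it does not show that no other strategy does better.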

c.  Discussion

The  optimal  decision-making  strategy  is  remarkably  simple;  but 
this  is  merely  a consequence  of  the  peculiar  rewards  specified  in  this 
particular problem. For example, first-iteration computations show
that  the  performance  achievable  with  perfect  state  information  is 
.5147,  and  the  performance  achievable  on  the  basis  of  no  information 
whatsoever  is  .25.  Thus  the  value  of  perfect  state  information  is 
no  more  than  .2647.  Examination,  which  costs  .25  and  yields  little 
information  about  the  state,  appears  unlikely  to  be  useful;  on  the 
ninth  iteration,  this  option  will  be  eliminated  entirely.  Had  the 
cost  of  examination  been  lower,  or  the  information  acquired  through 
examination  more  useful,  the  solution  might  have  been  considerably 
more  complex,  requiring  thousands  of  controller  memory  states.  An 
optimal  solution  might  not  have  been  obtained  at  all. 

In fact, the method described above cannot be used to generate
a solution, since the final iteration would involve a 3·4^8-state
Markov decision process! The algorithm that was actually used to
solve the Machine Maintenance and Repair Problem is described in
Section 22, and the solution obtained is reproduced in Section 23a
of this report.

The importance of perceptive dynamic programming as an engineering
tool is derived from the outcome of early iterations, rather than the
solution itself (if any is obtained). During iteration n, two
quantities of interest are computed. The first of these, g_n, is an upper
bound on performance that can be achieved if the (n-1) most recent
inputs and outputs constitute the only available information concerning
the (n-1) most recent transitions, although states further delayed
might be perfectly known. The second, h_n, is a lower bound on the
performance that can be achieved if decisions are made on the basis
of the (n-1) most recent inputs and outputs alone, and all other
information is discarded. Consequently g_n - h_n is an upper bound on
the value of information concerning events delayed by (n-1) time units.

In a practical engineering problem, it is reasonable to assume
that there exists a way to measure the internal state exactly, although
the cost associated with such a measurement might be exorbitant.
When g_n - h_n remains large for large n, this indicates that greatly
delayed perfect state information remains significantly useful for
purposes of decision-making, which in turn suggests the option of
periodically measuring the internal state exactly. If the interval
separating perfect state measurements is large, then the average cost
of periodic state measurements will be small, controller memory will
have been reduced and performance enhanced. On the other hand, if
g_n - h_n converges rapidly to zero, this indicates that information
sufficiently delayed is of little value in decision-making, and that
a near-optimal strategy having reasonable controller memory
requirements can be constructed.

d.  Summary 

Perceptive dynamic programming is a computational procedure that
may be used to examine problems of decision-making under uncertainty
constraints, with perfect recall of all information previously obtained.
This is done by considering a sequence of problem approximations in
which information dealing with events sufficiently delayed is either
superseded by the "perception" of delayed state values, or ignored.
The difference between performances achieved under these information
constraints establishes a value of delayed information which may be
compared with the cost of periodic state measurements, the cost of
retaining greatly delayed outputs in controller memory, and the cost
of continuing the design procedure. In the Machine Maintenance and
Repair Problem, the value of delayed information rapidly approached
zero, and an exact optimum was obtained.


4.  Historical Perspective

An  FPS  decision  theory  may  be  associated  with  several  disciplines. 
Some  of  these  are  listed  below,  along  with  representative  references; 
this  list  is  by  no  means  intended  to  be  exhaustive.  Since  an  FPS  is 
a probabilistic  automaton,  and  the  decision  strategy  is  represented 
as  a finite-state  machine,  the  study  of  FPS's  is  closely  related  to 
probabilistic automata theory; see Paz [1971] for a summary of
recent  trends  in  this  field.  Since  the  assessment  of  unknown  state 
values  is  involved  in  decision-making,  a theory  of  FPS  decisions  is 
related  to  statistical  decision  theory  in  the  sense  of  DeGroot  [1970]. 
FPS control problems are problems of stochastic control; the
introductory text of Kushner [1971] is a standard reference. Analysis of
the  optimization  problem  in  an  appropriate  (infinite-dimensional) 
vector  space  makes  use  of  techniques  described  by  Luenberger  [1969]. 
Finally,  an  FPS  is  a dynamical  system;  its  study  therefore  belongs  to 
what  Kalman,  Falb,  and  Arbib  [1967]  describe  as  the  "exciting  but 
chaotic  new  field  of  system  theory." 

Most of these disciplines are generally considered to be
outgrowths of the pioneering work of von Neumann and Morgenstern [1947].
A theory of statistical decisions was subsequently initiated by Wald
[1950]. The importance of the concept of state in structuring
sequential decision problems was enunciated by Richard Bellman [1957b];
he devised a general mathematical approach called dynamic programming,
which may be applied to the optimization of sequential decisions.
The finite-horizon Markov decision problem (Bellman [1957a]) is
particularly well-suited to solution by dynamic programming; also see
Howard [1960], Derman [1970], Mine and Osaki [1970], Ross [1970], Howard
[1971], Hastings [1973], and Bertsekas [1976].

Because  Markov  decision  problems  can  be  solved,  and  because 
structural  properties  of  the  solution  are  fairly  well  understood,  a 
great  deal  of  effort  has  been  devoted  to  improving  the  algorithms 
employed. Schweitzer [1973] has compiled a list of hundreds of
publications in this area. Among these, Brown [1965], Lanery [1967,
1968], Bather [1971] and Schweitzer and Federgruen [1977?] have
studied  convergence  properties  of  value  iteration,  which  is  regarded 
as  the  most  efficient  form  of  dynamic  programming;  see  Odoni  [1967] 
for  a comparison  of  convergence  rates  in  various  dynamic  programming 
forms.  The  basic  value  iteration  procedure  has  been  supplemented 
and improved in many ways: D.J. White [1963] introduced a method
for normalizing value functions in order to avoid divergence; Odoni
[1967,  1968]  generalized  a result  of  MacQueen  [1966]  to  obtain  a 
method  for  bounding  the  closeness  of  suboptimal  solutions  to  the 
optimum;  Schweitzer  [1971]  accelerated  value  iteration  by  adding  a 
damping term; Hastings [1976] devised a procedure for more efficient
enumeration  and  termination  when  the  optimum  has  been  reached;  the 
applicability  of  value  iteration  was  extended  by  Platzman  [1977]  who 
introduced  the  concept  of  connected  classes  in  Markov  decision 
processes. Value iteration is currently feasible for problems with
thousands of states (Schweitzer [1971]).

Partially-observable Markov decision problems have been studied
by Drake [1962], Astrom [1965, 1969], Sawaragi and Yoshikawa [1970],
and others as noted below. In each case, the problem was regarded
as one of decision-making with perfect state information, considering
the information vector to be the state of a transformed system.
However, the number of values which may be assumed by the information
vector is infinite. Thus the problem becomes one of dynamic programming
on the unit simplex Π_N (an infinite state set), and describing an
optimal decision-making policy, which is a finite-valued function on
Π_N. Kakalik [1965] approximated the unit simplex by a finite grid of
evenly spaced points; needless to say, the method failed to be
practical for all but very small problems. Sondik [1971] (in research also
reported  by  Smallwood  and  Sondik  [1973])  established  piecewise- 
linearity  of  the  value  function  and  finite-memory  realizability  of  the 
optimal  strategy  in  finite-horizon  problems;  however  this  too  fails  to 
be  feasible  if  the  number  of  faces  on  the  value  function  is  large. 
Existence  of  solutions  to  discounted  problems  was  established  by  Sondik 
[1971]  and  by  Satia  and  Lave  [1973].  C.C.  White  [1976]  has  shown  that 
these  results  are  also  applicable  to  a class  of  partially-observable 
semi-Markov  decision  models  that  are  externally  indistinguishable 
from  a discrete-time  partially-observable  Markov  decision  process. 

Existence of finite-memory solutions to certain infinite-horizon
problems had been noted by Drake [1962, 1968]. In the context of
statistical decision on a noisy Markov channel, this work has been
pursued by Sulmar [1974] and Devore [1974]. Sondik [1971] provided an
intuitive explanation for this phenomenon; his work inspired the
definition of detectability in the present research. Similar results,
regarding the near-sufficiency of a finite string of most recent
observations, have been obtained by Cerny [1969] and Kaijser [1975].
Systems with perfect but delayed state observations were introduced by
Brookes and Leondes [1973].

Finite-memory hypothesis-testing and N-armed bandit problems have
been studied by Cover and Hellman [1970], Hellman and Cover [1970a],
Cover, Freedman, and Hellman [1976], and others noted both in these
references and in DeGroot [1970]. One may observe, from the titles in
subsequent correspondence between Chandrasekaran [1970, 1971] and Hellman
and Cover [1970b], that there is some controversy over the meaning of
this problem. Chandrasekaran and Lam [1971] have subsequently proposed
an alternative formulation. The issue involved is the manner in which
memory should be allowed to increase as performance approaches its
supremum value. Similar issues arise in the solution of FPS control
problems; they are discussed in Section 20 of this report.

5.  Outline of Original Contributions

The aim of this research is to construct finite-memory observers,
to devise a method for bounding the value of information in
decision-making, and to establish a feasible computational procedure for the
design of ε-optimal finite-memory controllers. Such results are
meaningful only when supplemented by mathematical machinery which
justifies their validity. This section provides a heuristic
interpretation of concepts and intermediary results that are introduced
for the first time in this report, and which contribute significantly
to an understanding of the main results.

a.  Ill-posedness  of  certain  undiscounted  infinite-horizon  problems 

Consider a "dual control" problem described by the VFPS:

\[
N = 2, \quad U = \{0, 1, 2\}, \quad Y = \{1, 2\}, \quad \pi(0) = (.5,\ .5), \quad q(0) = \ldots \tag{5.1}
\]

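The inputs may be assigned the meanings:

\[
U = \left\{ \begin{array}{l}
0 : \text{Obtain a measurement} \\
1 : \text{The state is probably 1} \\
2 : \text{The state is probably 2}
\end{array} \right\}
\]

The outputs, likewise, are interpreted as:

\[
Y = \left\{ \begin{array}{l}
1 : \text{The state remained unchanged} \\
2 : \text{The state changed}
\end{array} \right\}
\]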

It  is  clear  that  use  of  input  0 causes  the  information  vector  to 
approach  a unit  vector,  and  use  of  inputs  1 or  2 causes  the  values  of 
information  vector  entries  to  remain  unchanged.  Hence,  when  input  0 
is  used,  information  is  gained,  but  no  reward  is  received;  when  inputs 
1 or  2 are  used,  a reward  is  received,  but  no  information  is  gained. 

If a discounted performance index is considered, then use of
input 0 will eventually be discontinued. This is true because a
decision-maker in information state (1-ε, ε) stands to gain no more
than ε/(1-β) by seeking further information, and receives an expected
reward of 1-ε if he forgoes further information. As β → 1, the point at
which use of input 0 is discontinued becomes more and more distant.

In the undiscounted case, the value of perfect state information (i.e.
a unit information vector) is infinite, relative to the value of any
information vector that is not a unit vector. A decision-maker
confronted with an infinite horizon will therefore choose input 0 at all
times. Consequently, he will receive no reward at all. E. Denardo
calls this "the infinitely-delayed splurge."

The infinitely delayed splurge may be avoided in a number of ways.
One way is to consider only discounted performance indices. Another
is to assume that the decision-maker has access to an infinite past;
he will then know the initial state exactly. However, it does not
suffice to require that the underlying process be ergodic. In this
problem, the internal state process consisted of independent Bernoulli
trials; and yet the infinitely delayed splurge occurred.

b.  Sufficient conditions for well-posedness

Two conditions which (together) are sufficient to assure
well-posedness of an undiscounted infinite-horizon FPS control problem are
now identified. The first, reachability, is a generalization of
connectivity in Markov decision processes. In a reachable FPS, it
is possible to select a finite sequence of inputs, on the basis of
the information vector alone, so that the probability of entering a
specified state is greater than 1-ρ, where ρ is the reachability
index. If ρ = 0, then there are reset actions that cause the state to
assume any desired value with probability one. As ρ increases to
1, it becomes more difficult to reach a desired state. If ρ = 1, then
the FPS is not reachable. Reachability is also parameterized by ℓ_ρ,
an upper bound on the number of transitions required to "reach" a state.


It will be demonstrated that the state set of any FPS may be
decomposed into connected classes, along with a (possibly empty) set
of transient states. Within any connected class, the FPS will be
reachable. The underlying process of a reachable FPS "loses memory"
as it proceeds forward in time, in the sense that unconditional state
probabilities in the future depend less and less on the present state.

The second condition has been given the name detectability. In
a detectable FPS, the information vector is increasingly insensitive
to increasingly delayed information, such as inputs, outputs, or
artificially perceived states. A more precise definition of
detectability is deferred to Section 5d, where appropriate metrics and
contractions will be introduced. Detectability is characterized by
parameters ℓ_α and 0 ≤ α < 1, where information concerning events
delayed by ℓ_α time units causes the information vector to vary by a
distance not exceeding α, on the average. If α = 0 then information
sufficiently delayed is of no value in decision-making. If α is close
to 1, then information greatly delayed is important in decision-making,
and conversely, the present decision will affect many decisions to
come. If α = 1, then the FPS is not detectable.

It will be demonstrated that the information state set of an FPS
can be decomposed into detectable classes, along with a (possibly
empty) set of null-recurrent information states. The information
process of a detectable FPS thus loses information as it is viewed
backward in time, in the sense that the present information vector
depends less and less on state values from the increasingly distant
past.

The conditions of reachability and detectability are
complementary, in a manner similar to controllability and observability in
linear systems.


c.  A Bound  on  the  Value  of  Information 


A key result, Theorem (19.3), states that any infinite-horizon
FPS control problem satisfying conditions of reachability and
detectability has a convex relative value function v*(·) satisfying:

\[
\max_{\pi \in \Pi_N} \{v^*(\pi)\} - \min_{\pi \in \Pi_N} \{v^*(\pi)\}
\le \frac{(\ell_\rho + \ell_\alpha)\, Q}{(1-\rho)(1-\alpha)} = \Omega \tag{5.2}
\]

where Q is given by (2.13). The expression on the right of (5.2) is
interpreted as the bound on the value of information. v* may become
undefined as ρ → 1 or α → 1.


d.  Metrics  and  Contractions 

Consider

\[
\delta[\pi, \pi'] = \sum_{i \in S} (\pi_i - \pi'_i)^+ ,
\]

the Hajnal measure, which is extensively used (as described in Paz [1971])
to demonstrate convergence of unconditional probability vectors, in the
theory of ergodic Markov chains. A more appropriate metric for the study
of conditional probability vectors is

\[
\Delta[\pi, \pi'] = \sup \{ \delta[\pi \circ w,\ \pi' \circ w] : w \in \Pi_N,\ w > 0 \} \tag{5.3}
\]

where π∘w is a vector in Π_N having elements

\[
(\pi \circ w)_i = \frac{\pi_i w_i}{\sum_{j \in S} \pi_j w_j} .
\]

It will be shown that:

\[
\delta[\pi, \pi'] \le \Delta[\pi, \pi'] \le 1 \tag{5.4}
\]

and

\[
\Delta[\pi, \pi'] = \frac{1 - \sqrt{c_1 c_2}}{1 + \sqrt{c_1 c_2}} \tag{5.5}
\]

where:

\[
c_1 = \min \{ \pi_i / \pi'_i : i \in S,\ \pi'_i > 0 \}, \qquad
c_2 = \min \{ \pi'_i / \pi_i : i \in S,\ \pi_i > 0 \}. \tag{5.6}
\]
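
The two measures can be compared numerically on a small example. The sketch below is illustrative only: it assumes the Hajnal measure is the sum of the positive parts of the componentwise differences, uses the closed form (5.5)-(5.6) for Δ, and the names `hajnal`, `reweight` and `big_delta` are hypothetical.

```python
from math import sqrt


def hajnal(p, r):
    """delta[p, r]: sum of the positive parts of p_i - r_i."""
    return sum(max(a - b, 0.0) for a, b in zip(p, r))


def reweight(p, w):
    """The vector p∘w, with elements p_i w_i / sum_j p_j w_j."""
    s = sum(a * b for a, b in zip(p, w))
    return [a * b / s for a, b in zip(p, w)]


def big_delta(p, r):
    """Delta[p, r] from the closed form (5.5), with c1 and c2 as in (5.6)."""
    c1 = min(a / b for a, b in zip(p, r) if b > 0)
    c2 = min(b / a for a, b in zip(p, r) if a > 0)
    root = sqrt(c1 * c2)
    return (1 - root) / (1 + root)


p, r = [0.5, 0.5], [0.8, 0.2]
w = [1 / 3, 2 / 3]      # a weighting that attains the supremum for this pair

print(hajnal(p, r))                              # unweighted distance
print(hajnal(reweight(p, w), reweight(r, w)))    # distance after reweighting
print(big_delta(p, r))                           # closed-form supremum
```

For this pair the unweighted Hajnal measure is 0.3, while the chosen weighting raises it to 1/3, which is also the value given by the closed form; this is one instance of the inequality (5.4).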


The topology induced by Δ on Π_N has many interesting properties which
are explored in Section 12d. For example, any convex function is
continuous with respect to Δ; in particular:

\[
v[\pi] - v[\pi'] \le \Delta[\pi, \pi'] \cdot 4 \left[ \max_{\pi \in \Pi_N} \{v[\pi]\} - \min_{\pi \in \Pi_N} \{v[\pi]\} \right] \tag{5.7}
\]

Now consider an input-output pair (u,y) such that P(y|u) is
subrectangular, i.e. p_ij(y|u) > 0 and p_i'j'(y|u) > 0 implies
p_ij'(y|u) > 0 and p_i'j(y|u) > 0. Let

\[
\alpha[(u,y)] = \max_{i, j \in S} \Delta[T(e_i, u, y),\ T(e_j, u, y)].
\]

Now 0 ≤ α[(u,y)] < 1, a consequence of the subrectangularity of P(y|u).


The contraction property is:

\[
\Delta[T(\pi, u, y),\ T(\pi', u, y)] \le \alpha(u,y)\, \Delta[\pi, \pi'] \tag{5.8}
\]

This is illustrated in Figure 5-1. It is seen that (u,y) causes
the unit simplex to be mapped into a somewhat smaller set. The
greater the number of recent input-output pairs available, the smaller
this set will be. Hence, the assumption that the information vector
ℓ times delayed had some convenient value allows an approximation
of the information vector to be calculated on the basis of the most
recent ℓ input-output pairs alone. This approximation is guaranteed
to be within a certain distance of the true value; that distance can
be computed by measuring the contraction imposed on the information
vector by the transition probability matrix corresponding to the most
recent ℓ input-output pairs.
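
The contraction can be observed on a two-state example. The sketch below is a hedged illustration: the matrix `P` is hypothetical (it is not taken from the Machine Maintenance problem), `update` assumes that T(π,u,y) is the usual Bayes update π P(y|u) normalized to sum to one, and `big_delta` uses the closed form for Δ in terms of the minimum ratios c1 and c2.

```python
from math import sqrt


def big_delta(p, r):
    """Delta[p, r] computed from the minimum ratios c1 and c2."""
    c1 = min(a / b for a, b in zip(p, r) if b > 0)
    c2 = min(b / a for a, b in zip(p, r) if a > 0)
    root = sqrt(c1 * c2)
    return (1 - root) / (1 + root)


def update(pi, P):
    """T(pi, u, y): information vector after observing y under input u."""
    n = len(pi)
    raw = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(len(P[0]))]
    total = sum(raw)
    return [x / total for x in raw]


# Hypothetical subrectangular matrix P(y|u); every entry is positive.
P = [[0.50, 0.25],
     [0.25, 0.50]]

unit_vectors = [[1.0, 0.0], [0.0, 1.0]]
alpha = max(big_delta(update(a, P), update(b, P))
            for a in unit_vectors for b in unit_vectors)

pi, pi_prime = [0.9, 0.1], [0.1, 0.9]
before = big_delta(pi, pi_prime)
after = big_delta(update(pi, P), update(pi_prime, P))

print(alpha, before, after)
```

In this example the two information vectors start at distance 0.8 and, after one update, lie at distance 8/30, which equals α times the original distance with α = 1/3. The bound (5.8) is therefore met with equality for this particular pair.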


Figure 5-1.  Contractions on the Unit Simplex

In the establishment of detectability, subrectangularity plays
a role analogous to that of block rectangularity in the establishment
of connectivity in Markov chains. An FPS satisfies a condition of
strong detectability if there is an integer ℓ_α such that, for every
possible sequence of consecutive input-output pairs

\[
(u_1, y_1)(u_2, y_2) \ldots (u_{\ell_\alpha}, y_{\ell_\alpha}),
\]

the cumulative transition probability matrix

\[
P(y_1|u_1) \cdot P(y_2|u_2) \cdots P(y_{\ell_\alpha}|u_{\ell_\alpha})
\]

is subrectangular. It follows, from the
contraction property stated above, that an estimate of the information
vector can be made arbitrarily close (in a Δ sense) by recalling a
sufficiently long string of recent input-output pairs. In particular,
an estimate made on the basis of nℓ_α input-output pairs always lies
within α^n of the true information vector, for some α < 1.

Weak detectability is a condition which implies that the expected
deviation of the information vector estimate from its true value can
be made arbitrarily small in an analogous way. In a weakly detectable
system, ᾱ denotes the average contraction induced by the most recent
ℓ_α input-output pairs. The average contraction induced by the most
recent nℓ_α pairs is now given by ᾱ^n. ᾱ is a measure of
detectability which differs slightly from α.

e.  Existence of ε-optimal Controllers

Consider the relative value function for a reachable, detectable
FPS. It will be seen that this function spans a range of values which
cannot exceed

\[
\Omega = \frac{(\ell_\rho + \ell_\alpha)\, Q}{(1-\rho)(1-\alpha)} .
\]

Thus, for any stochastic vectors π and π′,

\[
|v^*[\pi] - v^*[\pi']| \le 4\Omega .
\]


When state perception is introduced, the information vector
changes, at any given time, in such a way that the expected relative
value of the new information vector will be greater than that of the
old information vector. The difference between these quantities, called
the value of perception, is shown in Figure 5-2. If perception of
states with an ℓ time-unit delay is assumed, then the gain will
increase by at most

    ᾱ^(ℓ/ℓ_α) 4Ω = ᾱ^(ℓ/ℓ_α) · 4(ℓ_α + ℓ_ρ)Q / ((1-ρ)(1-ᾱ)).


The substitution of guessed state values for perceived states is
called pseudo-perception. If a delayed state value is guessed, then
the controller finds itself acting according to one information vector
while actually in another information state. The value of acting
according to a particular information vector is linear in the actual
information state, because

    E{value of acting according to π' | π(k)}
        = Σ_i π_i(k) E{value of acting according to π' | s(k) = i}.

Thus the cost of pseudo-perception is as shown in Figure 5-3; this cost
cannot exceed

    ᾱ^(ℓ/ℓ_α) 4ℓ_αΩ / (1-ᾱ) = ᾱ^(ℓ/ℓ_α) · 4ℓ_α(ℓ_α + ℓ_ρ)Q / ((1-ρ)(1-ᾱ)^2).
Figure 5-2. Geometric Interpretation of Performance Increase Due to
Perception

Figure 5-3. Geometric Interpretation of Performance Decrease Due to
Pseudo-perception

An intuitive justification of these expressions is provided by the
following argument. Consider an FPS where ℓ_ρ = ℓ_α = 1. Then it costs
Q/(1-ρ) units to reach a desired state, if it is assumed that the
state is perfectly observed. This is true because Q is the cost
(per unit time) of being in an undesirable state instead of being
in a most desirable state, and because the expected number of
transitions required to reach the most desirable state is 1/(1-ρ).

Suppose now that state uncertainty is introduced. Then the
uncertainty caused when the most recent state perception occurred ℓ
time units ago is ᾱ^ℓ. Thus the value of a single perception,
delayed ℓ time units, is

    ᾱ^ℓ[4Q/(1-ρ)] + ᾱ^(ℓ+1)[4Q/(1-ρ)] + ... = ᾱ^ℓ 4Q / ((1-ρ)(1-ᾱ)).

The cost of pseudo-perception is similarly derived, resulting in
an additional factor of (1-ᾱ) in the denominator.

f. Feedback Realization of ε-optimal Controllers

The definition of an FPS, given in Section 2a, is structural
rather than functional. Much of the detail provided in the
specification of a particular FPS is irrelevant to an observer who has
access only to inputs and outputs. For example, the internal states of
an FPS may be reordered (by means of suitable row and column
manipulations on the initial state probability vector and transition
probability matrices) to obtain a new system which cannot be
distinguished from the first on the basis of input-output histories
alone. Two or more FPS's which are indistinguishable in this sense
will be called equivalent.

A valued finite probabilistic system (VFPS) was defined as an FPS,
along with a reward structure which allows a performance to be assigned
to any control strategy. If two or more VFPS's consist of equivalent
FPS's, along with reward structures that result in identical performance
indices, these VFPS's will be called equivalent.

The  problem  under  consideration  is  to  compute  a control  strategy 
that  optimizes  the  performance  index  corresponding  to  a particular  VFPS. 
The  concept  of  equivalence  is  used  to  transform  this  problem  into  one 
that  is  more  easily  solved:  it  suffices  to  compute  a strategy  which 

optimizes  the  performance  index  corresponding  to  any  particular 
equivalent  VFPS. 

A convenient equivalent VFPS is constructed by a procedure known
as augmentation. Any augmented VFPS is completely described by the
original VFPS from which it was obtained, and a memory set, M, which
is a finite set of strings of input-output pairs. An observer is
required to select, from the memory set, the element that correctly
lists the largest number of most recent input-output pairs; this is
called the memory state. An augmented state consists of the internal
state delayed by a quantity equal to the length of the memory state,
along with the memory state itself. Since the augmented state may be
regarded as the state of a controlled Markov chain, an equivalent VFPS
having augmented internal states in place of internal states may be
constructed. This VFPS is the outcome of augmentation induced by M.

An  example  of  augmentation  may  be  found  in  Section  3.  During  the 
n-th  iteration,  a memory  set  containing  all  strings  of  (n-1)  input- 
output  pairs  is  employed.  Thus  the  memory  state  consists  of  the  (n-1) 
most  recent  input-output  pairs,  and  the  augmented  state  consists  of 
the  true  internal  state  delayed  by  (n-1)  time  units,  along  with  the 
string  of  all  intervening  input-output  pairs. 

The perceptive or feasible strategy computed during an iteration
of perceptive dynamic programming determines inputs on the basis of
the current augmented state alone, and thus it may be viewed as a
feedback strategy. This implies that the system under such a strategy
is a Markov chain, a fact that is useful in evaluating feasible
performances.


6.  Organization  of  the  Report 


Mathematical tools for the analysis of FPS's are introduced in
Chapter II. A brief outline of this chapter is given below. The
notation to be used in representing strings of input-output pairs is
presented in Section 7. The concepts of "memory state" and
"augmentation" are made precise in Sections 8 and 9. In the
computational technique of perceptive dynamic programming, it is
assumed that the augmented state (induced by some memory set M) can be
"perceived" by the controller; dynamic programming then yields a rule
for optimal (perceptive) decision-making, expressed as a policy on the
augmented state set. However, the performance index is a function of
strategy, or rule for decision-making on the basis of all past inputs,
states and outputs. The relationship between a strategy and the policy
which realizes it is made precise in Section 10. Connectivity and
reachability are defined in Section 11. It is demonstrated that both
properties are preserved when the state is augmented. Sections 12 and
13 provide the basis for definition, in Section 14, of detectability.
This involves the development of appropriate metrics and contractions,
as discussed in Section 5d. Solutions to the finite-memory estimation
problem are then introduced. The final sections of Chapter II are
concerned with applicability of perceptive dynamic programming. In
Section 15, it is shown how any free FPS can be decomposed into
detectable parts; thus perceptive dynamic programming can always be
applied to each detectable component of the problem. Section 16
establishes that very few FPS's are equivalent to a state-calculable
FPS; were this not so, many FPS control problems could be solved by
dynamic programming alone.

Chapter  III  is  devoted  to  a study  of  the  structure  of  optimal 
controllers.  The  finite-horizon  and  state-observable  cases  are 
reviewed  in  Sections  17  and  18.  It  is  then  demonstrated,  in  Section 
19,  that  (under  suitable  assumptions)  an  optimal  strategy  will  exist, 
although  it  may  require  infinite  memory.  In  some  cases,  however,  the 
notion of an undiscounted infinite horizon is ill-defined, and the
problem  is  meaningless.  An  alternate  formulation,  in  which  irregular 
features  are  constrained  to  finite-horizon  consideration,  is  proposed 
in  Section  20. 

Any optimal controller which requires infinite memory cannot,
in general, be described exactly. Chapter IV introduces a
computational technique which allows the optimal performance to be
approached as a memory constraint is weakened. This technique, called
perceptive dynamic programming, approximates the problem as a Markov
decision problem solvable by dynamic programming. The approximation is
obtained by means of an assumption that delayed state values can be
artificially "perceived." Like dynamic programming, perceptive dynamic
programming is a general approach which can be realized in many ways;
these are discussed in Section 21. Results obtained by implementation
of a perceptive dynamic programming algorithm are then presented: a
solution to the Machine Maintenance and Repair Problem, and an analysis
of a computer communication problem.

Peripheral ideas, and conjectures regarding potential extensions
of  the  theory,  have  been  collected  in  Chapter  V. 

A symbol  table  and  glossary  are  provided  to  assist  the  reader 
in  assimilating  the  terminology  and  notation  of  Chapter  II. 
CHAPTER II

ANALYSIS OF FINITE PROBABILISTIC SYSTEMS

7. Input-output Words

Because  strings  of  input-output  words  play  a most  important  role 
in  the  analysis  of  FPS's,  it  is  essential  that  a compact  notation  be 
developed  for  their  representation.  Such  a notation  is  introduced  in 
this  section. 

(7.1) Notation. A finite string a = a_1 ... a_ℓ of elements in set
A is called a word over A. Words are always identified by underscores.
The set of all words over A is denoted A*. ℓ(a) is the length of word
a. e is the empty word (over any set). If a = a_1 ... a_ℓ, and
a' = a'_1 ... a'_ℓ', then a a' = a_1 ... a_ℓ a'_1 ... a'_ℓ' is called
the concatenation of a with a'; clearly a = a e = e a for any word a.
If A and B are sets, then the concatenation AB denotes the set of words
of the form a b where a∈A and b∈B. A^ℓ is the set of words consisting
of exactly ℓ consecutive elements in A; A^(ℓ*) is the set of words
consisting of up to ℓ consecutive elements in A.
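
The word operations of (7.1) can be sketched in Python as follows. This is only an illustration under stated assumptions: words are represented as tuples, the empty word e as the empty tuple, and A^(ℓ*) is assumed to include e. The helper names (concat, words_exact, words_up_to) are hypothetical and do not appear in the report.

```python
from itertools import product

EMPTY = ()  # the empty word e (assumed representation)


def concat(a, b):
    """Concatenation a a' of two words stored as tuples."""
    return a + b


def words_exact(alphabet, length):
    """A^l: every word made of exactly `length` elements of the alphabet."""
    return {tuple(w) for w in product(sorted(alphabet), repeat=length)}


def words_up_to(alphabet, length):
    """A^(l*): every word of at most `length` elements (e assumed included)."""
    result = set()
    for n in range(length + 1):
        result |= words_exact(alphabet, n)
    return result


A = {1, 2}
a = (1, 2, 2)  # a word of length 3 over A
```

Here len(a) plays the role of ℓ(a), and concat(a, EMPTY) returns a unchanged, matching a = a e = e a.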

(7.2) Definition. Z denotes the set of input-output pairs (u,y) such
that P(y|u) ≠ 0.

Remark: More generally, Z may be defined as the set of equivalence
classes of input-output pairs corresponding to identical non-zero
transition probability matrices. The tabulation of Z in Section 3 is
consistent with this alternate definition.

(7.3) Notation. The following objects will be used interchangeably:

1) a word over Z, i.e. a string of pairs (u_1,y_1) ... (u_ℓ,y_ℓ), and

2) a pair of words over U and Y, respectively, having equal length,
i.e. (u,y) = (u_1 ... u_ℓ, y_1 ... y_ℓ). In a free FPS, the input
component of an input-output pair may be omitted.

(7.4) Definition. For z = (u_1,y_1)(u_2,y_2) ... (u_ℓ,y_ℓ) ∈ Z*, define

    P(z) = P(y_1|u_1) · P(y_2|u_2) · ... · P(y_ℓ|u_ℓ).

Also P(e) is the N×N identity matrix.

Interpretation: P_ij(z) = P_ij((u,y)) is the probability that the FPS
will emit output word y and go to state j, given that it had been in
state i and that input word u was subsequently accepted.
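
As a hedged illustration of (7.4), the following sketch computes P(z) as a product of one-step matrices. The 2-state free FPS used here (U = {1}, Y = {1, 2}) and its numerical entries are hypothetical, chosen only so that the rows of P(1|1) + P(2|1) sum to one; the function names are not from the report.

```python
def identity(n):
    """The n x n identity matrix, i.e. P(e)."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def word_matrix(P, z, n):
    """P(z): product of the matrices P(y|u) along the word z."""
    result = identity(n)
    for pair in z:
        result = mat_mul(result, P[pair])
    return result


# Hypothetical example data: keys are input-output pairs (u, y).
P = {
    (1, 1): [[0.5, 0.25], [0.0, 0.5]],
    (1, 2): [[0.25, 0.0], [0.25, 0.25]],
}
z = ((1, 1), (1, 2))
Pz = word_matrix(P, z, 2)
```

With zero-based indices, Pz[0][1] is the probability of emitting outputs 1 then 2 and arriving in the second state, given a start in the first state.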

(7.5) Definition. (a) I(z) = {i∈S : P_ij(z) > 0, some j∈S}

(b) J(z) = {j∈S : P_ij(z) > 0, some i∈S}

Interpretation: I(z) is the set of states that may precede the evolution
of input-output word z; J(z) is the set of states that may follow it.

(7.6) Definition. (a) Z+ = {z∈Z* : P(z) ≠ 0}

(b) Z+(π^1,π^2, ...) = {z∈Z* : π^1 P(z) ≠ 0, π^2 P(z) ≠ 0, ...}

Interpretation: Z+ is the set of input-output words that might eventually
evolve. Z+(π^1,π^2, ...) is the set of input-output words that might
evolve when the information vector equals π^1 and also might evolve when
the information vector equals π^2, etc.

The  information  vector  transition  function  was  defined  in  (2.8)  for 
a one-step  transition,  i.e.  the  case  where  the  information  vector  is 
updated  as  soon  as  a single  input-output  pair  becomes  available.  It  is 
possible  to  generalize  this  transformation  to  the  case  of  a multiple- 
step  transition. 

(7.7) Definition. For any π∈Π_N, z∈Z+(π),

    T(π,z) = πP(z) / (πP(z)1).

(7.8) Lemma. If z z' ∈ Z+(π), then

    T(π, z z') = T(T(π,z), z').
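
The multiple-step update (7.7) and the composition property (7.8) can be checked numerically on a small case. The sketch below is illustrative only: the 2-state system, the prior π = (0.5, 0.5) and the function names are assumptions, not data from the report.

```python
def vec_mat(v, a):
    """Row vector times matrix."""
    return [sum(v[i] * a[i][j] for i in range(len(v)))
            for j in range(len(a[0]))]


def transition(pi, P, z):
    """T(pi, z) = pi P(z) / (pi P(z) 1); requires z in Z+(pi)."""
    v = list(pi)
    for pair in z:
        v = vec_mat(v, P[pair])  # accumulates pi P(z) one pair at a time
    total = sum(v)               # this is pi P(z) 1
    if total == 0:
        raise ValueError("z is not in Z+(pi)")
    return [x / total for x in v]


# Hypothetical 2-state free FPS; keys are input-output pairs (u, y).
P = {
    (1, 1): [[0.5, 0.25], [0.0, 0.5]],
    (1, 2): [[0.25, 0.0], [0.25, 0.25]],
}
pi = [0.5, 0.5]
z1 = ((1, 1),)
z2 = ((1, 2),)

direct = transition(pi, P, z1 + z2)                     # T(pi, z z')
two_step = transition(transition(pi, P, z1), P, z2)     # T(T(pi, z), z')
```

Both routes give the same information vector up to rounding, which is what Lemma (7.8) asserts.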
8. Memory Sets and Memory States

This section makes precise the notion of a memory set (a vocabulary
of recent input-output pairs), and a memory state (a summary,
not necessarily complete, of recent input-output pairs, lying in the
memory set). Appropriate notation is first introduced.

(8.1) Definition. z(k_1;k_2) denotes the word of input-output pairs
that evolved between times k_1 and k_2. Specifically:

    z(k_1;k_2) = ((u(k_1),y(k_1+1)) (u(k_1+1),y(k_1+2)) ... (u(k_2-1),y(k_2)))

(8.2) Definition. (a) "≤" denotes the partial order on Z* defined by
z' ≤ z if ∃ z"∈Z* such that z"z' = z.

(b) If M is a finite nonempty subset of Z* that is totally ordered
by "≤", then max[M] denotes the unique element z of M for which there
holds z' ≤ z ∀ z'∈M; min[M] is analogously defined.

(c) If z∈Z*, then trunc[z] = {z'∈Z* : z' ≤ z}.

(d) If z' ≤ z, then z - z' = z" where z"z' = z.

Interpretation: Recall that z is a word (i.e. a string) of input-output
pairs. z' ≤ z is used to indicate that z can be split into two parts
so that z' matches the rightmost part. z = max[M] is a word in M having
the property that all words in M are rightmost substrings of z. min[M]
is a word in M which is a rightmost substring of every (other) word in
M. trunc[z] is the set of rightmost substrings of z, i.e. truncated
versions of z. z - z' is what remains when the rightmost substring z'
is removed from z.
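
The suffix order and its derived operations can be sketched as below. This is a minimal illustration assuming words are tuples; the function names are hypothetical, and max_word relies on the fact that, in a set totally ordered by "≤", the maximum is simply the longest word.

```python
def is_suffix(z_prime, z):
    """True when z' <= z, i.e. z = z'' z' for some word z''."""
    return len(z_prime) <= len(z) and z[len(z) - len(z_prime):] == z_prime


def trunc(z):
    """trunc[z]: the set of all rightmost substrings of z (e included)."""
    return {z[i:] for i in range(len(z) + 1)}


def max_word(M):
    """max[M] for a finite nonempty M that is totally ordered by <=."""
    top = max(M, key=len)
    if not all(is_suffix(w, top) for w in M):
        raise ValueError("M is not totally ordered by the suffix order")
    return top


def minus(z, z_prime):
    """z - z': what remains of z once the rightmost substring z' is removed."""
    if not is_suffix(z_prime, z):
        raise ValueError("z' is not a rightmost substring of z")
    return z[:len(z) - len(z_prime)]


z = ("a", "b", "c")
```

For this z, trunc(z) contains the four words e, (c), (b c) and (a b c), and max_word(trunc(z)) recovers z itself.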

(8.3) Lemma. trunc[z] is a finite nonempty set which is totally ordered
by "≤", and e ∈ trunc[z].

It  is  now  possible  to  formulate  the  following  definition: 

(8.4) Definition. A memory set M is a finite nonempty subset of Z*
which satisfies

    (i)  M = ∪_{z∈M} trunc[z]

and

    (ii) M ⊂ [MZ ∪ {e}].

The memory state induced by M at time k is

    z^M(k) = max[M ∩ trunc[z(0;k)]].

Interpretation: The memory set may be arranged in the form of a
left-handed tree, called the memory tree, as shown in Figure 8-1. An
arrow from z' to z indicates that z' ≤ z. The memory state at any time
is the element of M that correctly summarizes the largest number of
most recent input-output pairs. Following Figure 8-1, a memory state
may be constructed by following the tree, from right to left, as far as
possible. The first condition which M must satisfy in (8.4) guarantees
that a memory tree may be constructed, and hence that memory states will
be well-defined. The second condition assures that memory states can
be recursively computed, as demonstrated in (8.6) below.

Figure 8-1. A Memory Tree (time axis: k-2, k-1, k). Note: Since the
FPS is free, the input component of an input-output pair may be ignored.

Example: Z^(ℓ*) ∩ Z+ is a memory set. The memory state induced by that
memory set, at times k∈<ℓ,∞>, is the string of ℓ most recent
input-output pairs.

(8.5) Definition. The memory state transition function induced by M
is a mapping T^M : M × Z → M given by

    T^M[z,z'] = max[M ∩ trunc[z z']], z∈M, z'∈Z.

(8.6) Proposition. z^M(k+1) = T^M[z^M(k), (u(k),y(k+1))].
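
The recursion in (8.6) can be exercised on a toy memory set. The sketch below is an illustration under assumptions: the FPS is taken to be free, so each input-output pair is abbreviated by its output label, and the sets M_good and M_bad are hypothetical. M_bad is suffix-closed but violates condition (ii) of (8.4), and the recursive update then disagrees with the direct definition of the memory state.

```python
from itertools import product


def trunc(z):
    """All rightmost substrings of the word z, including the empty word."""
    return {z[i:] for i in range(len(z) + 1)}


def is_memory_set(M, Z):
    """Check conditions (i) and (ii) of (8.4) for a finite set of tuples."""
    suffix_closed = all(trunc(z) <= M for z in M)                 # (i)
    extendable = all(z == () or (z[:-1] in M and z[-1] in Z)
                     for z in M)                                  # (ii)
    return bool(M) and suffix_closed and extendable


def memory_state(M, history):
    """max[M ∩ trunc[history]]: the longest suffix of history lying in M."""
    return max(M & trunc(history), key=len)


def step(M, z, pair):
    """T^M[z, z'] for a single new pair z'."""
    return memory_state(M, z + (pair,))


def recursion_agrees(M, Z, horizon):
    """Compare the recursive update with the direct definition on all histories."""
    for history in product(sorted(Z), repeat=horizon):
        z = ()
        for k, pair in enumerate(history, start=1):
            z = step(M, z, pair)
            if z != memory_state(M, history[:k]):
                return False
    return True


Z = {1, 2}                                    # hypothetical output labels
M_good = {(), (1,), (2,), (1, 2), (2, 2)}     # satisfies (i) and (ii)
M_bad = {(), (2,), (1, 2)}                    # satisfies (i) but not (ii)
```

The exhaustive comparison is only feasible for short horizons; it is meant as a sanity check of the proposition, not as a substitute for the proof.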

Proof: If z^M(k+1) = e, then the result is trivial. Now assume that
z^M(k+1) ≠ e. Then it follows that there exists a z'∈Z* such that
z^M(k+1) = z'(u(k),y(k+1)). But, by condition (ii) of (8.4),

    z^M(k+1) = max[M ∩ trunc[z(0;k+1)]]
             ≤ max[(MZ ∪ {e}) ∩ trunc[z(0;k+1)]]
             = max[MZ ∩ trunc[z(0;k+1)]]
             = max[MZ ∩ trunc[z(0;k)(u(k),y(k+1))]]
             = z^M(k)(u(k),y(k+1)).

So z^M(k+1) ∈ M ∩ trunc[z^M(k)(u(k),y(k+1))], and hence

    z^M(k+1) ≤ max[M ∩ trunc[z^M(k)(u(k),y(k+1))]].

But

    z^M(k) ≤ z(0;k)
    ⇒ z^M(k)(u(k),y(k+1)) ≤ z(0;k+1)
    ⇒ trunc[z^M(k)(u(k),y(k+1))] ⊂ trunc[z(0;k+1)]
    ⇒ max[M ∩ trunc[z^M(k)(u(k),y(k+1))]] ≤ max[M ∩ trunc[z(0;k+1)]]
                                            = z^M(k+1).

Thus z^M(k+1) ≤ max[M ∩ trunc[z^M(k)(u(k),y(k+1))]] ≤ z^M(k+1), which
establishes the desired equality. †

Certain  properties  of  memory  sets  are  now  developed  for  use  in 
later  sections. 

(8.7) Lemma. (a) An intersection of memory sets is a memory set.

(b) A concatenation of memory sets is a memory set.

(8.8) Definition. If M is a finite subset of Z*, then mem[M] denotes
the smallest memory set containing M, i.e. the intersection of all memory
sets containing M.

(8.9) Definition. The essential part of memory set M is the subset:

    ess[M] = {max[M ∩ trunc[z]] : z∈(Z+ - M)} ⊂ M

Interpretation: There are elements of a memory set which may become
memory states only during an initial transient of bounded duration.
For example, in the memory set Z^(ℓ*) ∩ Z+, the memory state at time k
consists of the min(k,ℓ) most recent input-output pairs; if k ≥ ℓ, then
the memory state consists of the ℓ most recent input-output pairs; in
this case ess[Z^(ℓ*) ∩ Z+] = Z^ℓ ∩ Z+. In the memory tree interpretation
of a memory set, a node in M is contained in ess[M] if it has branches
in Z+ that are not contained in M.

(8.10) Lemma. If M is a memory set, then mem[ess[M]] = M.

(8.11) Lemma. If z∈ess[M], then T^M[z,z']∈ess[M].

Interpretation: Once the memory state enters ess[M], it cannot leave
it.
(8.12) Definition. If M is a memory set, then

    ℓ_max[M] = max{ℓ(z) : z∈M}

    ℓ_min[M] = min{ℓ(z) : z∈ess[M]}

(8.13) Lemma. For any control strategy γ,

    Prob_γ{z^M(k)∈ess[M]} = 1, k∈<ℓ_max[M],∞>.

Interpretation: The memory state enters ess[M] by the ℓ_max[M]-th
transition.

The  notion  of  a memory  state  transition  function,  introduced  in 
(8.5),  may  be  extended  to  multiple-step  transitions,  as  follows. 

(8.14) Definition.

    T^M[z,z'] = max[M ∩ trunc[z z']], z∈M, z'∈Z+

(8.15) Lemma.

    T^M[z,z'z"] = T^M[T^M[z,z'],z"], z∈M, z', z"∈Z+.

Interpretation: (8.15) establishes consistency of (8.14) with (8.5)
and (8.6).
9. Equivalence and Augmentation

This section introduces the "augmented system" induced by a memory
set, an FPS whose state consists of a delayed internal state and a
memory state. The augmented system will be seen to be "equivalent" to
the original system, in the sense that they are indistinguishable on
the basis of inputs and outputs alone.

(9.1) Definition. The input-output relation of an FPS is a mapping
p : Z* → [0,1] given by p(z) = π(0)P(z)1.

Interpretation: p(z) = p((u,y)) is the probability that output word
y will be emitted initially, given that the word of initial inputs was
u. The mapping p is a summary of all externally discernable
characteristics of an FPS.
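
As an illustration of (9.1), the sketch below evaluates p(z) and compares two systems on all words up to a given length. It is a hedged example: the 2-state system is hypothetical, the helper names are not from the report, and agreement on a finite set of words is only a necessary check, not a proof of equivalence. The relabelled system is built by reordering internal states, which should leave the input-output relation unchanged.

```python
from itertools import product


def io_relation(pi0, P, z):
    """p(z) = pi(0) P(z) 1 for a word z of input-output pairs."""
    v = list(pi0)
    for pair in z:
        a = P[pair]
        v = [sum(v[i] * a[i][j] for i in range(len(v)))
             for j in range(len(v))]
    return sum(v)


def relabel(pi0, P, perm):
    """Reorder internal states: new state a is old state perm[a]."""
    n = len(pi0)
    new_pi0 = [pi0[perm[a]] for a in range(n)]
    new_P = {pair: [[mat[perm[a]][perm[b]] for b in range(n)]
                    for a in range(n)]
             for pair, mat in P.items()}
    return new_pi0, new_P


def same_relation(sys1, sys2, pairs, max_len, tol=1e-12):
    """Compare p(z) for every word z of length at most max_len."""
    for n in range(max_len + 1):
        for z in product(pairs, repeat=n):
            if abs(io_relation(*sys1, z) - io_relation(*sys2, z)) > tol:
                return False
    return True


# Hypothetical 2-state free FPS; keys are input-output pairs (u, y).
pi0 = [1.0, 0.0]
P = {
    (1, 1): [[0.5, 0.25], [0.0, 0.5]],
    (1, 2): [[0.25, 0.0], [0.25, 0.25]],
}
relabelled = relabel(pi0, P, [1, 0])
```

Changing only the initial vector, for instance to (0, 1), generally changes p and so yields a system that is not equivalent to the original.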

(9.2) Definition. The expected incremental reward function of a VFPS
is a mapping q : Z+(π(0)) × U → R given by q(z,u) = T(π(0),z) q(u).

Interpretation: q(z,u) is the expected incremental reward if,
immediately following the generation of input-output history z, input u
is selected. The mappings p and q together summarize all externally
discernable characteristics of a VFPS.
(9.3) Definition. Two or more FPS's are (mutually) equivalent if
their input-output relations coincide. Two or more VFPS's are (mutually)
equivalent if both their input-output relations and their expected
incremental reward functions respectively coincide.

The problem of constructing an FPS specification having a given
input-output relation is called stochastic realization. Stochastic
realization has been extensively studied by Paz (1971). Picci, in
hitherto unpublished research, formulated the conjecture that almost
every FPS is equivalent to a state-calculable FPS. Picci's conjecture
is disproved in Section 16 of this report.

Realization  of  a particular  input-output  relation  generally  entails 
the  incorporation  of  artificial  structure  into  the  model.  The  smaller 
the  number  of  states  used,  the  greater  the  quantity  of  artificial 
structure  incorporated;  consequently  state  calculability  may  be  inhibited. 
This  is  illustrated  below: 

(9.4) Example. Consider a free state-calculable FPS with U = {1},
Y = {1,2,3,4}, N = 8, π(0) = e*, and

This FPS is not only state-calculable; its state is uniquely determined
by the most recent pair of outputs. It is equivalent to the 4-state
FPS having transition probability matrices:

Equivalence is verified by using the Markov property of consecutive
output pairs. The second process, though equivalent to the first, is
not state-calculable.

The problem of computing, for a given FPS, an equivalent system
having a minimal number of states is (to the author's knowledge)
unsolved, and, in any event, very intricate. It is, of course, possible
to eliminate states that are overtly redundant (see Paz [1971], Section
I.B.2); the elimination of such redundancy may reduce computation time
in the algorithms of Chapter IV in this report. On the other hand, it
is by increasing the number of states that state-calculability is
enhanced, and the problem is eventually solved. This situation is
notably different from that found in linear systems, where observability
occurs only when the state space has been reduced to a minimal dimension.

(9.5) Definition. The augmented state set induced by memory set M is
the set X[M] = {[i,z] : i∈S, z∈M, e^i P(z)1 > 0}. The augmented state
induced by M at time k is x^M(k) = [s(k-ℓ(z^M(k))), z^M(k)].

Example: Memory set Z^(ℓ*) ∩ Z+ induces augmented states consisting
of the internal state delayed by ℓ time units and the memory state of
ℓ most recent input-output pairs.

(9.6) Proposition. For any FPS along with a memory set M, there is
a unique equivalent FPS having internal state process {x^M(k)}.

Proof: It is sufficient to show that the augmented underlying process
is a controlled Markov chain. This occurs provided that the sequence
of controlled random variables {k-ℓ(z^M(k))} is non-decreasing, a trivial
consequence of (8.6). †


(9.7) Definition. The FPS which is equivalent to a given FPS, and has
internal states that are the augmented states (of the given system)
induced by memory set M, is called the augmentation (of that FPS)
induced by M, or, more informally, the augmented system induced by M.

A particularly efficient representation of the augmented system
is obtained by recognizing that, although the augmented system has
approximately N·#M states, each of these may effect a transition to
at most N·#Z states. Specifically, P^M_z(i,j,z') may denote the
probability that a transition to [j,T^M(z,z')] will occur, given that the
system is presently in augmented internal state [i,z] and that the
input component of z' has been selected. It is given by the formula:

(9.8)  P^M_z(i,j,z') =
           P_ij(z z' - T^M(z,z')) · Σ_{k∈S} P_jk(T^M(z,z')) / Σ_{k∈S} P_ik(z),
                                                        if z∈Z+(e^i);
           undefined, otherwise.

The transformed incremental rewards are described by arrays:

(9.9)  q^M_z(i,u) =
           T(e^i,z) q(u),  if z∈Z+(e^i);
           undefined, otherwise.

Thus, the memory requirement to describe a particular augmented FPS is
roughly #M × [(N^2 × #Z) + (N × #U)] words. The fact that this quantity
grows linearly in #M is particularly significant as the augmented system
has N × #M states, and the number of transition probability matrix
entries might normally be expected to grow as the square of the number
of augmented states.
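
The storage comparison in the preceding paragraph can be made concrete with a small calculation. The sketch below is illustrative only: the figures N = 4, #Z = 6, #U = 2 are hypothetical, and the "dense" count assumes one full matrix over the augmented state set per input-output pair, which is one reading of "the square of the number of augmented states" rather than a formula from the report.

```python
def compact_storage(N, num_Z, num_U, num_M):
    """#M x [(N^2 x #Z) + (N x #U)]: the compact count quoted in the text."""
    return num_M * (N * N * num_Z + N * num_U)


def dense_storage(N, num_Z, num_M):
    """Assumed dense alternative: a full (N #M) x (N #M) matrix per pair in Z."""
    states = N * num_M
    return states * states * num_Z


# Hypothetical sizes, used only to show the growth rates.
N, num_Z, num_U = 4, 6, 2
table = [(m, compact_storage(N, num_Z, num_U, m), dense_storage(N, num_Z, m))
         for m in (10, 100, 1000)]
```

Doubling #M doubles the compact count but quadruples the dense one, which is the point of the remark about linear growth.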
10. Classification of Strategies

A strategy was defined, in Section 2d, as a rule for the
determination of inputs, specified by probability distributions for u(k)
conditioned on each past history [s(0),u(0),y(1),s(1), ..., s(k-1),
u(k-1),y(k),s(k)]. In such a form, however, the description of a
strategy occupies an infinite tableau, and decisions must be made on
the basis of infinite memory. Such difficulties are avoided by
introducing a class of strategies that are totally specified by a finite
tableau, called a policy.

(10.1) Definition. Let M be a memory set. Then φ is a feasible
strategy adapted to M if there is a policy φ̂ : M → U such that

    Prob_φ{u(k) = φ̂(z^M(k))} = 1, k∈<0,∞>.

φ̂ is then the policy (on M) which realizes φ. Φ[M] denotes the set of
feasible strategies adapted to M. A feasible strategy that is adapted
to some memory set is called a feasible adapted strategy.

Interpretation: If φ∈Φ[M], then the inputs prescribed by φ can be
determined by a finite memory controller whose memory set is M. Note
that the input specified by φ̂ and that specified by φ need not
coincide in situations which cannot occur when φ is used.
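
A finite memory controller of the kind described in (10.1) can be sketched as a small class that keeps only the current memory state and looks up the policy. This is a hypothetical illustration: the class name, the memory set (all words of at most one pair over U × Y with U = Y = {0, 1}) and the policy ("repeat the last output as the next input") are assumptions made for the example.

```python
from itertools import product


class FiniteMemoryController:
    """Applies a policy on a memory set M; words are tuples of (u, y) pairs."""

    def __init__(self, M, policy):
        if () not in M:
            raise ValueError("a memory set must contain the empty word")
        self.M = M
        self.policy = policy
        self.state = ()  # memory state at time 0 is the empty word

    def act(self):
        """Input prescribed by the policy for the current memory state."""
        return self.policy[self.state]

    def observe(self, u, y):
        """Memory state update: longest suffix of state + (u, y) lying in M."""
        word = self.state + ((u, y),)
        for start in range(len(word) + 1):
            if word[start:] in self.M:
                self.state = word[start:]
                return


U = (0, 1)
Y = (0, 1)
pairs = list(product(U, Y))
M = {()} | {(p,) for p in pairs}          # words of at most one pair
policy = {(): 0}                          # initial input (arbitrary choice)
policy.update({(p,): p[1] for p in pairs})  # afterwards, repeat the last output

controller = FiniteMemoryController(M, policy)
first = controller.act()
controller.observe(first, 1)
second = controller.act()
controller.observe(second, 0)
third = controller.act()
```

The controller never stores more than one pair here, so its decisions depend only on the memory state, as required of a feasible strategy adapted to M.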


Remark: There exist finite-memory controllers that are not adapted (to
any memory set).

(10.2) Definition. Let M be a memory set. Then ψ is a perceptive
strategy adapted to M if there is a policy ψ̂ : X[M] → U such that

    Prob_ψ{u(k) = ψ̂[x^M(k)]} = 1, k∈<0,∞>.

ψ̂ is the policy (on X[M]) which realizes ψ. Ψ[M] denotes the set of
all perceptive strategies adapted to M. A perceptive strategy that is
adapted to some memory set is called a perceptive adapted strategy.

Interpretation: If ψ∈Ψ[M], then the inputs prescribed by ψ can be
computed on the basis of x^M(k) alone. Note again that the input
specified by ψ̂ and that specified by ψ need not coincide in
situations which cannot occur when ψ is used.

(10.3) Lemma. (a) Φ[M] ⊂ Ψ[M].

(b) If M ⊂ M', then Φ[M] ⊂ Φ[M'].

A (feasible or perceptive) adapted strategy induces on any FPS a
free system whose underlying process is a Markov chain. Thus each
augmented state may be characterized as transient or recurrent, under
any particular adapted strategy. The memory state, likewise, may be
given these attributes.


(10.4) Definition. Consider an adapted strategy ψ, along with a memory
state z∈M. If there is an i∈S such that the augmented state [i,z] is
recurrent under ψ, then z is recurrent under ψ; otherwise z is
transient under ψ.

The concept of transient and recurrent memory states has the
following application: Suppose that some optimal (or ε-optimal)
strategy has been specified, by means of a policy on a memory set to
which that strategy is adapted. If the performance index is average gain
over an undiscounted infinite horizon, then the policy may be modified
in a number of ways without affecting performance. In particular, the
input specified for any transient memory state may be replaced by any
other value, provided that it does not cause that memory state to
become recurrent. In this manner, an optimal or suboptimal strategy
adapted to a smaller memory set might be obtained.


11.  Connectivity 


Graph properties of Markov chains have been generalized to
controlled Markov chains by Platzman [1977]. These concepts are now
extended to FPS's.

(11.1) Definition. State i is connected to state j if there exists
an input-output word z∈Z+ such that P_ij(z) > 0.

Interpretation : If  i is  connected  to  j , then  it  is  possible  for  the 

system  to  travel  from  state  i to  state  j,  provided  that  appropriate 
inputs  are  accepted.  This  does  not  imply  availability  of  reset  inputs 
(which  transfer  the  system  to  a given  state  with  probability  one). 

(11.2) Definition. A connected class C is a set of mutually
connected states, none of which is connected to a state outside C.

Clearly the state set of any FPS contains at least one connected
class.

(11.3) Definition. An FPS is connected if its state set is a
connected class.
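
Definitions (11.1) to (11.3) can be checked mechanically on small systems. The sketch below is an illustration under assumptions: states are numbered from 0, the one-step graph has a link from i to j whenever some P(y|u) has a positive (i,j) entry, every state is taken to be connected to itself through the empty word, and both example systems are hypothetical.

```python
def reach_matrix(P, n):
    """R[i][j] is True when state i is connected to state j."""
    R = [[i == j or any(mat[i][j] > 0 for mat in P.values())
          for j in range(n)]
         for i in range(n)]
    for k in range(n):  # Warshall transitive closure
        for i in range(n):
            for j in range(n):
                R[i][j] = R[i][j] or (R[i][k] and R[k][j])
    return R


def connected_classes(P, n):
    """Sets of mutually connected states with no link leaving the set."""
    R = reach_matrix(P, n)
    classes = []
    for i in range(n):
        C = frozenset(j for j in range(n) if R[i][j] and R[j][i])
        closed = all(not R[a][b] for a in C for b in range(n) if b not in C)
        if closed and C not in classes:
            classes.append(C)
    return classes


def is_connected(P, n):
    """An FPS is connected when its whole state set is a connected class."""
    return connected_classes(P, n) == [frozenset(range(n))]


# Hypothetical free FPS with a transient state 0 feeding the class {1, 2}.
P_chain = {(1, 1): [[0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0],
                    [0.0, 1.0, 0.0]]}

# Hypothetical connected 2-state FPS.
P_two = {(1, 1): [[0.5, 0.25], [0.0, 0.5]],
         (1, 2): [[0.25, 0.0], [0.25, 0.25]]}
```

In the first system, state 0 is connected to states 1 and 2 but not conversely, so it belongs to no connected class.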

(11.4) Proposition. If an FPS is connected, then there is an integer
$\ell_\lambda \in \langle 1,N\rangle$ and a $\lambda \in [0,1)$ such that,
corresponding to any $i,j \in S$, an input word $u \in U^{\ell_\lambda}$
exists, satisfying

\[ 1 - \sum_{y \in Y^{\ell_\lambda}} P_{ij}((u,y)) \le \lambda . \]

Remark: $\ell_\lambda$ and $\lambda$ may be computed by enumeration on $U^N$.
A more efficient algorithm seeks a least costly path from node $i$ to
node $j$, where $-\log \sum_{y \in Y} P_{i'j'}(y|u)$ is the cost of a link
from $i'$ to $j'$ labeled with input $u$.
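
The least-cost-path idea in the remark can be sketched as follows. This is
only an illustration under stated assumptions: the nested-list layout
`P[u][y][i][j]` for $P_{ij}(y|u)$, zero-based indices, and the function name
`least_cost_input_word` are all hypothetical, and Dijkstra's algorithm is
used as one standard way of finding a least costly path (the report does not
name a specific algorithm). The returned probability is that of the single
best state path, which is a lower bound on the probability of reaching $j$.

```python
import heapq
import math

def least_cost_input_word(P, i, j):
    """Return (input word, path probability) for the least costly path i -> j.

    P[u][y][a][b] is assumed to hold P_ab(y|u); the cost of a link a -> b
    labeled u is -log(sum_y P_ab(y|u)). Returns None if j is unreachable.
    """
    n = len(P[0][0])
    dist = [math.inf] * n
    back = [None] * n            # back[b] = (previous state, input used)
    dist[i] = 0.0
    heap = [(0.0, i)]
    while heap:
        d, a = heapq.heappop(heap)
        if d > dist[a]:
            continue             # stale heap entry
        for u, per_output in enumerate(P):
            for b in range(n):
                q = min(sum(Py[a][b] for Py in per_output), 1.0)
                if q > 0 and d - math.log(q) < dist[b]:
                    dist[b] = d - math.log(q)
                    back[b] = (a, u)
                    heapq.heappush(heap, (dist[b], b))
    if math.isinf(dist[j]):
        return None
    word, b = [], j
    while b != i:                # walk the back-pointers to recover inputs
        a, u = back[b]
        word.append(u)
        b = a
    return list(reversed(word)), math.exp(-dist[j])

# Hypothetical three-state system with two inputs and a single output.
A = [[0.5, 0.5, 0.0], [1/3, 1/3, 1/3], [0.0, 0.0, 1.0]]
B = [[1/3, 1/3, 1/3], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
system = [[A], [B]]
print(least_cost_input_word(system, 0, 2))
print(least_cost_input_word(system, 2, 0))
```

Because every link cost is nonnegative, a least costly path never needs to
revisit a state, so the recovered input word has length at most $N-1$.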

In a connected FPS, it is possible to select inputs which allow
the system to travel from any state to any other, provided that the
initial state is known. This assumption is avoided in (11.5), below.


(11.5) Definition. An FPS is reachable if there is an integer $\ell_\rho$
and a $\rho \in [0,1)$ such that, corresponding to every $\pi \in \Pi_N$ and
$j \in S$, an input word $u \in U^{\ell_\rho}$ exists satisfying:

\[ 1 - \sum_{y \in Y^{\ell_\rho}} \sum_{i \in S} \pi_i P_{ij}((u,y)) \le \rho . \]

Interpretation: If an FPS is reachable, then for any value of the
information vector, there exists a sequence of inputs which will drive
the state to a desired value with probability $1-\rho$ or more.


(11.6) Proposition. An FPS is reachable iff it is connected.

Proof: Assume connectivity and set $\ell_\rho = \ell_\lambda$;
$\rho = 1 - \frac{1}{N}(1-\lambda)$. For any $\pi \in \Pi_N$, there is an
$i \in S$ such that $\pi_i \ge 1/N$. Selection of $u$ according to (11.4),
for $i$ as determined above and $j$ as desired, satisfies the criterion in
(11.5). That reachability implies connectivity is trivial.

Remark: Although reachability is the property required to establish
the existence of optimal strategies in FPS control problems,
connectivity is the property that can be decided algorithmically.
Reachability can be established by inspection in some systems (e.g. a
network of finite queues), and the bounds thus obtained will be tighter
than those obtained through connectivity arguments.

(11.7) Definition. An FPS is simply connected if its state set
consists of a single connected class, along with a (possibly empty) set
of states which are transient under all feasible strategies.

(11.8) Theorem. Let $C$ be the connected class in the state set of a
simply connected FPS, and let $M$ be a memory set. Then the augmented
system induced by $M$ is simply connected, having connected class

\[ \hat X[M] = \{[i,z] : i \in C,\ z \in \mathrm{ess}[M] \cap Z^+(e^i)\} \subset X[M] . \]

Proof: Augmented states of the form $[i,z]$ with $i \in S-C$ are clearly
transient. Those of the form $[i,z]$ with $z \in M - \mathrm{ess}[M]$ cannot
occur after the $\ell_{\max}[M]$-th transition, by (8.12). To show that
$[i,z]$ and $[i',z'] \in \hat X[M]$ are connected, select $j \in C$ so that
$P_{ji'}(z') > 0$ and $z'' \in Z^+$ so that $P_{ij}(z'') > 0$, the existence
of the latter being guaranteed by (11.1). Then the augmented system may
travel from state $[i,z]$ to state $[i',z']$ when the intervening
input-output word is $z''z'$. †

An algorithm which decides whether a given state-observable FPS
is simply connected was introduced by Platzman [1977]. Simple
connectivity of the underlying process is not necessarily implied by
simple connectivity of the FPS, as is illustrated below:


(11.9) Example. Let $U=\{1,2\}$, $S=\{1,2,3\}$, $Y=\{1\}$,
$\pi(0) = (1/2, 1/2, 0)$ and

\[ P(1|1) = \begin{pmatrix} 1/2 & 1/2 & 0 \\ 1/3 & 1/3 & 1/3 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
   P(1|2) = \begin{pmatrix} 1/3 & 1/3 & 1/3 \\ 1/2 & 1/2 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]

The single connected class is $\{3\}$; states 1 and 2 are transient under
all feasible strategies. Yet there exists a perceptive strategy under
which states 1 and 2 form a recurrent class: this is the strategy

\[ u(k) = s(k). \]
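
The claims of Example (11.9) can be checked by direct computation. The
sketch below is illustrative only: it uses exact fractions, treats a
feasible strategy as an open-loop input sequence (an assumption that is
reasonable here because the single output carries no information), and the
helper names `step` and `mass_outside_class` are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

# Transition matrices of Example (11.9), keyed by input; Y has one element.
P = {
    1: [[F(1, 2), F(1, 2), F(0)], [F(1, 3), F(1, 3), F(1, 3)], [F(0), F(0), F(1)]],
    2: [[F(1, 3), F(1, 3), F(1, 3)], [F(1, 2), F(1, 2), F(0)], [F(0), F(0), F(1)]],
}

def step(pi, u):
    """One-step update of the state distribution under input u."""
    return [sum(pi[i] * P[u][i][j] for i in range(3)) for j in range(3)]

def mass_outside_class(inputs):
    """Probability of still being in states 1 or 2 after the given inputs."""
    pi = [F(1, 2), F(1, 2), F(0)]
    for u in inputs:
        pi = step(pi, u)
    return pi[0] + pi[1]

# Every open-loop input word of length 6 leaves the same residual mass.
open_loop = {mass_outside_class(w) for w in product((1, 2), repeat=6)}

# Under u(k) = s(k), state 1 uses row 1 of P(1|1) and state 2 uses
# row 2 of P(1|2); neither row assigns probability to state 3.
closed_loop_rows = [P[1][0], P[2][1]]

print(open_loop)
print(closed_loop_rows)
```

With the symmetric initial vector, each input multiplies the mass remaining
in states 1 and 2 by 5/6, so that mass decays geometrically under any input
word, whereas the perceptive strategy never leaves $\{1,2\}$.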

The following algorithm will (in principle) determine whether or
not a given FPS is simply connected. It does so by seeking to discover
a strategy under which the state will never enter the connected class.


(11.10) Algorithm. Let $C$ denote the unique connected class in the
state set of a given FPS. Label each nonempty subset $H$ of $S-C$ with
a binary digit denoted $c(H)$; initially $c(H)=0$, for all $H \subset S-C$.
Then perform the following step, for every $H \subset S-C$, until the
$c(\cdot)$ remain invariant: set $c(H)=1$ if, for every $u \in U$, either

\[ \sum_{i \in H} \sum_{y \in Y} \sum_{j \in C} P_{ij}(y|u) > 0 \]

or

\[ \sum_{y \in Y} c(\{j : P_{ij}(y|u) > 0,\ i \in H\}) > 0 . \]

Then the FPS is simply connected iff $c(H)=1$, for all nonempty subsets
$H$ of $S-C$.
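
A direct transcription of Algorithm (11.10) is sketched below. It is meant
only as an illustration: the data layout `P[u][y][i][j]` for $P_{ij}(y|u)$,
the zero-based state indices, and the function name `simply_connected` are
assumptions, and a successor set that is empty is treated as carrying the
label 0. The subset enumeration is exponential in the size of $S-C$.

```python
from itertools import combinations

def simply_connected(P, C):
    """Run the labeling of Algorithm (11.10).

    P[u][y][i][j] is assumed to hold P_ij(y|u); C is the set of states in
    the unique connected class. Returns (decision, labels).
    """
    n = len(P[0][0])
    outside = [s for s in range(n) if s not in C]
    subsets = [frozenset(c) for r in range(1, len(outside) + 1)
               for c in combinations(outside, r)]
    label = {H: 0 for H in subsets}

    def passes(H, Pu):
        # First condition: some state of H can enter C in one transition.
        if any(Py[i][j] > 0 for Py in Pu for i in H for j in C):
            return True
        # Second condition: some output leads to a support set labeled 1.
        for Py in Pu:
            nxt = frozenset(j for i in H for j in range(n) if Py[i][j] > 0)
            if label.get(nxt, 0) == 1:
                return True
        return False

    changed = True
    while changed:               # repeat until the labels remain invariant
        changed = False
        for H in subsets:
            if label[H] == 0 and all(passes(H, Pu) for Pu in P):
                label[H] = 1
                changed = True
    return all(label.values()), label

# Example (11.9): two inputs, one output, connected class {3} (index 2).
P11 = [[0.5, 0.5, 0.0], [1/3, 1/3, 1/3], [0.0, 0.0, 1.0]]
P12 = [[1/3, 1/3, 1/3], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
blind = [[P11], [P12]]

# Hypothetical state-observable variant: the output equals the next state.
observed = [[[[x if j == y else 0.0 for j, x in enumerate(row)]
              for row in Pu[0]] for y in range(3)] for Pu in blind]

print(simply_connected(blind, {2})[0])
print(simply_connected(observed, {2})[0])
```

On Example (11.9) the labeling terminates with every subset labeled 1,
while the state-observable variant leaves the singletons labeled 0,
reflecting the perceptive strategy $u(k) = s(k)$.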

(11.11) Proposition. If an FPS is simply connected, then there is an
integer $\ell \le 2^{\#(S-C)}$ such that the augmented system induced by $M$
has a simply connected underlying process whenever $\ell_{\min}[M] \ge \ell$.

Proof: Define $H(k) = \{i : \pi_i(k) > 0\}$ and assume that
$H(0) \subset S-C$. Then (11.10) implies the following: for any given
values of $H(k-1)$ and $u(k-1)$, either $H(k)$ may contain elements in $C$,
or there is a $y(k)$ such that $H(k)$ will be distinct from
$H(0), \ldots, H(k-1)$. But there are $2^{\#(S-C)}-1$ nonempty subsets of
$S-C$, so $H(2^{\#(S-C)})$ may contain elements in $C$, i.e.
$\mathrm{Prob}\{H(2^{\#(S-C)}) \cap C \text{ is nonempty}\} > 0$ under any
feasible strategy. Thus, internal states lying outside $C$ are transient
under any strategy adapted to $M$, provided that
$\ell_{\min}[M] \ge 2^{\#(S-C)}$. †

When $S-C$ is a large set, the enumeration of subsets of $S-C$ is
computationally infeasible. A sufficient condition for simple
connectivity is now derived.

(11.12) Lemma. If, in the outcome of Algorithm (11.10), $c(A)=1$ and
$B \supset A$, then $c(B)=1$.

(11.13) Theorem. An FPS is simply connected if its underlying process
is simply connected.

Proof: Simple connectivity of the underlying process implies
$c(\{i\})=1$, $\forall i \in S-C$. Hence, by (11.12), $c(H)=1$ for all
nonempty subsets $H$ of $S-C$. In (11.10), this is the sufficient
condition for simple connectivity. †


12.  Metrics 


This section introduces metrics that are used to measure the
"closeness" of approximations to the information vector. The continuity
of convex functions with respect to these metrics is then established.

a.  Definition  of  the  Metrics 

(12.1) Definition. Consider $\pi \in \Pi_N$, $w \in R^N$ with $w \ge 0$ and
$\pi w > 0$. Then $\pi \circ w$ is a vector in $\Pi_N$ having entries:

\[ (\pi \circ w)_i = \frac{\pi_i w_i}{\pi w} . \]


Interpretation: This is merely Bayes' operator. For example, $\pi$ might
represent a priori probabilities of some random variable, $s$, on sample
space $S$. Given an event occurring with conditional probability $w_i$
provided that $i$ is the true value of $s$, then $\pi \circ w$ is the
vector of a posteriori probabilities of random variable $s$.
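
As a small numerical illustration of the Bayes operator of (12.1), the
sketch below forms $\pi \circ w$ for a three-state prior. The function name
`bayes` and the numbers are hypothetical.

```python
def bayes(pi, w):
    """Return the posterior vector with entries pi_i * w_i / (pi w)."""
    total = sum(p * x for p, x in zip(pi, w))
    if total <= 0:
        raise ValueError("pi w must be positive for the operator to be defined")
    return [p * x / total for p, x in zip(pi, w)]

prior = [0.5, 0.3, 0.2]          # a priori probabilities of s
likelihood = [0.9, 0.1, 0.0]     # w_i = Prob{event | s = i}
posterior = bayes(prior, likelihood)
print(posterior)
```

States with zero likelihood receive zero posterior probability, which is
why the operator can move a vector from the interior of $\Pi_N$ onto one
of its faces.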


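(12.2) Definition. For $\pi,\pi' \in \Pi_N$, define

(a) $\delta[\pi,\pi'] = \sum_{i \in S} (\pi_i - \pi'_i)^+$;

(b) $\Delta[\pi,\pi'] = \sup\{\delta[\pi \circ w, \pi' \circ w] : w \in R^N,\ w \ge 0,\ \pi w > 0,\ \pi' w > 0\}$;

(c) $D[\pi,\pi'] = 1 - \min[\{\pi_i/\pi'_i : \pi'_i > 0,\ i \in S\} \cup \{\pi'_i/\pi_i : \pi_i > 0,\ i \in S\}]$.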
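
Parts (a) and (c) of Definition (12.2) translate directly into code. The
sketch below is illustrative; the function names `delta` and `D` are
assumptions, and vectors are plain Python lists.

```python
def delta(pi, pi2):
    """Hajnal measure: the sum of the positive parts of pi_i - pi2_i."""
    return sum(max(a - b, 0.0) for a, b in zip(pi, pi2))

def D(pi, pi2):
    """One minus the smallest entrywise ratio, taken in both directions."""
    ratios = [a / b for a, b in zip(pi, pi2) if b > 0]
    ratios += [b / a for a, b in zip(pi, pi2) if a > 0]
    return 1.0 - min(ratios)

p, q = [0.5, 0.5], [0.25, 0.75]
print(delta(p, q), D(p, q))

# Different supports give D = 1 even when delta is small.
print(delta([0.999, 0.001], [1.0, 0.0]), D([0.999, 0.001], [1.0, 0.0]))
```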

Remark: An interpretation of these functions is given in section 12b,
following the derivation of certain fundamental properties.

(12.3) Lemma. (a) $0 \le \delta[\pi,\pi'] = \tfrac{1}{2}|\pi - \pi'| \le \Delta[\pi,\pi'] \le 1$;

(b) $0 \le D[\pi,\pi'] \le 1$.

(12.4) Lemma. $\Delta[\pi \circ w, \pi' \circ w] \le \Delta[\pi,\pi']$,
$\forall \pi,\pi' \in \Pi_N$, $w \in R^N$, $w \ge 0$, $\pi w > 0$, $\pi' w > 0$.

(12.5) Proposition. $\delta$, $\Delta$, and $D$ are metrics on $\Pi_N$.

Proof: A metric satisfies

(i) $f[\pi,\pi'] \ge 0$,

(ii) $f[\pi,\pi'] = 0 \iff \pi = \pi'$,

(iii) $f[\pi,\pi'] = f[\pi',\pi]$,

(iv) $f[\pi,\pi'] + f[\pi',\pi''] \ge f[\pi,\pi'']$.

(a) Since $|\cdot|$ is a norm on $R^N$, it defines a metric $|\pi - \pi'|$
on $\Pi_N$. By (12.3)(a), $\delta[\cdot,\cdot]$ is a metric on $\Pi_N$.

(b) Parts (i) and (ii) are trivial.

(iii) $\Delta[\pi,\pi'] = \sup\{\delta[\pi \circ w, \pi' \circ w] : w \in R^N,\ w \ge 0\}$
$= \sup\{\delta[\pi' \circ w, \pi \circ w] : w \in R^N,\ w \ge 0\}$
$= \Delta[\pi',\pi]$.

(iv) $\Delta[\pi,\pi'] + \Delta[\pi',\pi'']$
$\ge \sup\{\delta[\pi \circ w, \pi' \circ w] + \delta[\pi' \circ w, \pi'' \circ w] : w \in R^N,\ w \ge 0\}$
$\ge \sup\{\delta[\pi \circ w, \pi'' \circ w] : w \in R^N,\ w \ge 0\}$
$= \Delta[\pi,\pi'']$.

(c) Parts (i) and (ii) are trivial.

(iii) $D[\pi,\pi'] = D[\pi',\pi]$ by symmetry.

(iv) For $\pi,\pi',\pi'' \in \Pi_N$, assume with no loss of generality
that $\pi''_j > 0$ and $D[\pi,\pi''] = 1 - (\pi_j/\pi''_j)$. If $\pi'_j = 0$,
then $D[\pi',\pi''] = 1$ and $D[\pi,\pi'] + D[\pi',\pi''] \ge 1 \ge D[\pi,\pi'']$.
If $\pi'_j > 0$, then $(\pi_j/\pi''_j) = (\pi_j/\pi'_j)(\pi'_j/\pi''_j)$ and
$(1-D[\pi,\pi'])(1-D[\pi',\pi'']) \le 1 - D[\pi,\pi'']$, implying
$D[\pi,\pi''] \le D[\pi,\pi'] + D[\pi',\pi''] - D[\pi,\pi'] \cdot D[\pi',\pi''] \le D[\pi,\pi'] + D[\pi',\pi'']$. †


(12.6) Theorem. (Evaluation of $\Delta$). For $\pi,\pi' \in \Pi_N$, define:

\[ c_1 = \min\{\pi'_i/\pi_i : \pi_i > 0\}, \qquad c_2 = \min\{\pi_i/\pi'_i : \pi'_i > 0\}. \]

Then

\[ \Delta[\pi,\pi'] = \frac{1 - \sqrt{c_1 c_2}}{1 + \sqrt{c_1 c_2}} . \]


Proof: If $\{i : \pi_i > 0\} \ne \{i : \pi'_i > 0\}$ then $\Delta[\pi,\pi'] = 1$.
To see this, assume without loss of generality that there is an $i \in S$
such that $\pi_i > 0$ and $\pi'_i = 0$. Then
$\{w^m\} = \{(\tfrac{1}{m})\mathbf{1} + (1 - \tfrac{1}{m})e^i\}$ is a sequence
in $R^N$ for which $\lim_{m \to \infty} \delta[\pi \circ w^m, \pi' \circ w^m] = 1$,
since $(\pi \circ w^m)_i \to 1$ and $(\pi' \circ w^m)_i = 0$.
By (12.3)(a), the sequence $\{w^m\}$ is supremal.

It follows from (12.5) that $\Delta[\pi,\pi] = 0$. The case
$\pi_i > 0 \iff \pi'_i > 0$, $\pi \ne \pi'$ remains. By (12.5),
$\Delta[\pi,\pi'] > 0$. Assume without loss of generality that $\pi > 0$
and $\pi' > 0$. Clearly $0 < c_1 < 1$ and $0 < c_2 < 1$; hence
$0 < c_1 < 1 < 1/c_2 < \infty$.

Define:

\[ \Delta_\xi[\pi,\pi'] = \sup\{\delta[\pi \circ w, \pi' \circ w] : w \in R^N,\ w \ge 0,\ \pi' w/\pi w = \xi\} \]
\[ \qquad = \sup\Big\{\sum_{i \in S} (\pi_i w_i - \pi'_i w_i/\xi)^+ : w \in R^N,\ w \ge 0,\ \pi w = 1,\ \pi' w = \xi\Big\} \]

which exists for all $c_1 \le \xi \le 1/c_2$. Clearly

\[ \Delta[\pi,\pi'] = \max\{\Delta_\xi[\pi,\pi'] : c_1 \le \xi \le 1/c_2\} . \]


Now $\Delta_\xi[\pi,\pi']$ may be expressed as the solution of a linear
program

\[ \Delta_\xi[\pi,\pi'] = \max:\ aw \quad \text{subject to:} \quad \pi w = 1,\ \ \pi' w = \xi,\ \ w \ge 0, \]

where

\[ a_i = (\pi_i - \pi'_i/\xi)^+ . \]


Any optimal basic $w$ that solves this linear program has at most two
non-zero entries; let these be denoted $(i,j)$. Then

\[ \Delta_\xi[\pi,\pi'] = \max:\ a_i w_i + a_j w_j \]
\[ \text{subject to:} \quad w_i \ge 0,\ \ w_j \ge 0,\ \ \pi_i w_i + \pi_j w_j = 1,\ \ \pi'_i w_i + \pi'_j w_j = \xi . \]

Assume without loss of generality that

\[ (i,j) \in A = \{(i,j) : (\pi'_i/\pi_i) < (\pi'_j/\pi_j)\} . \]

Now $a_i > 0$ and $a_j = 0$; for otherwise one of the following must hold:

(i) $a_i = 0$, $a_j = 0 \implies \Delta_\xi[\pi,\pi'] = 0$;

(ii) $a_i > 0$, $a_j > 0 \implies \Delta_\xi[\pi,\pi'] = a_i w_i + a_j w_j = (\pi_i - \pi'_i/\xi) w_i + (\pi_j - \pi'_j/\xi) w_j = 1 - 1 = 0$;

(iii) $a_i = 0$, $a_j > 0 \implies (i,j) \notin A$.

Hence $\xi$ must be such that $(\pi'_i/\pi_i) < \xi \le (\pi'_j/\pi_j)$. The
basic feasible solution with indices $(i,j)$ is now seen to take the form:


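\[ w_i = \frac{\pi'_j - \xi \pi_j}{\pi_i \pi'_j - \pi_j \pi'_i} \ge 0, \qquad
   w_j = \frac{\xi \pi_i - \pi'_i}{\pi_i \pi'_j - \pi_j \pi'_i} > 0 \]

and the corresponding expression for $\Delta_\xi[\pi,\pi']$ is

\[ \Delta_{\xi,i,j}[\pi,\pi'] = a_i w_i
   = \frac{(\pi_i - \pi'_i/\xi)(\pi'_j - \xi \pi_j)}{\pi_i \pi'_j - \pi_j \pi'_i}
   = \frac{\pi_i \pi'_j + \pi_j \pi'_i - \xi \pi_i \pi_j - \xi^{-1} \pi'_i \pi'_j}{\pi_i \pi'_j - \pi_j \pi'_i} . \]

Now

\[ \Delta[\pi,\pi'] = \max\{\Delta_\xi[\pi,\pi'] : c_1 \le \xi \le 1/c_2\}
   = \max_{(i,j) \in A}\ \max_{(\pi'_i/\pi_i) < \xi \le (\pi'_j/\pi_j)} \Delta_{\xi,i,j}[\pi,\pi'] . \]

Since $\Delta_{\xi,i,j}[\pi,\pi']$ is concave in $\xi$, it achieves a unique
maximum at

\[ \xi^* = \sqrt{\frac{\pi'_i \pi'_j}{\pi_i \pi_j}} . \]

Thus

\[ \Delta[\pi,\pi'] = \max_{(i,j) \in A} \frac{\sqrt{\pi_i \pi'_j} - \sqrt{\pi_j \pi'_i}}{\sqrt{\pi_i \pi'_j} + \sqrt{\pi_j \pi'_i}}
   = \frac{1 - \sqrt{c_1 c_2}}{1 + \sqrt{c_1 c_2}} . \qquad † \]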
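
The closed form of Theorem (12.6) can be compared against the defining
supremum in a small case. The sketch below is an illustration only: it
evaluates $\delta[\pi \circ w, \pi' \circ w]$ on a logarithmic grid of
weight vectors $w = (a, 1)$ for $N = 2$, which is a crude search chosen here
for simplicity and not a method taken from the report. All function names
are hypothetical.

```python
import math

def delta(p, q):
    return sum(max(a - b, 0.0) for a, b in zip(p, q))

def bayes(p, w):
    total = sum(a * x for a, x in zip(p, w))
    return [a * x / total for a, x in zip(p, w)]

def Delta(p, q):
    """Closed form (1 - sqrt(c1 c2)) / (1 + sqrt(c1 c2)) of Theorem (12.6)."""
    c1 = min(b / a for a, b in zip(p, q) if a > 0)
    c2 = min(a / b for a, b in zip(p, q) if b > 0)
    r = math.sqrt(c1 * c2)
    return (1 - r) / (1 + r)

def Delta_grid(p, q, steps=300):
    """Lower bound on the supremum, using w = (10**(k/100), 1)."""
    best = 0.0
    for k in range(-steps, steps + 1):
        w = [10 ** (k / 100), 1.0]
        best = max(best, delta(bayes(p, w), bayes(q, w)))
    return best

p, q = [0.5, 0.5], [0.25, 0.75]
closed, searched = Delta(p, q), Delta_grid(p, q)
print(closed, searched)
```

For these vectors $c_1 = 1/2$ and $c_2 = 2/3$, and the grid search approaches
the closed-form value from below, as a supremum requires.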
b.  Discussion

The metric $\delta$, also known as the Hajnal measure, has many
applications in the theory of ergodic Markov chains; see Paz [1971].
Informally, $\delta[\pi,\pi']$ is the (minimal) "quantity" of probability
that would have to be "reassigned" in order to transform probability
distribution $\pi$ into probability distribution $\pi'$. Similarly,
$\Delta[\pi,\pi']$ is the minimal quantity of conditional probability by
which $\pi$ and $\pi'$ might differ if they were supplemented by identical
observations (in the sense of the interpretation following (12.1)).
Consequently two information vectors that are very close in the sense of
$\delta$ may be far apart in the sense of $\Delta$. This occurs because
subsequent observations might cause the two information vectors
(representing similar a priori assumptions) to be transformed into
radically different conclusions.

(12.7) Example. Consider an FPS in which $\pi(0) = (1-\varepsilon, \varepsilon)$,
$\varepsilon \ll 1$, but it is desired to approximate $\pi(0)$ by
$e^1 = (1,0)$. In a $\delta$ sense, $\pi(0)$ is "near" the approximation
$e^1$; this indicates that the unconditional expectation of a function of
the initial state will not be significantly affected by this
approximation. Suppose, however, that every input-output pair which
subsequently evolves corresponds to transition probabilities given by a
fixed matrix (the matrix is illegible in the source). Given a sufficient
number of input-output pairs of this form, the conditional initial state
probability vector tends to $(0,1)$; yet if the approximation
$\hat\pi(0) = e^1$ is used, then the conditional initial state probability
vector will remain $e^1$. Thus an initial error, of $\delta$-sense magnitude
$\varepsilon \ll 1$, may lead to an eventual error of $\delta$-sense
magnitude arbitrarily close to 1.
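
The effect described in Example (12.7) can be reproduced numerically. The
transition matrix of the example is not legible in the source, so the sketch
below substitutes a hypothetical one: a diagonal matrix
$\mathrm{diag}(0.1, 0.9)$, under which the state never moves and each
observed pair is nine times more likely from state 2 than from state 1.
The function name and all numbers are assumptions made for illustration.

```python
def conditional_initial_state(prior, pair_likelihood, k):
    """Posterior over the initial state after k identical observed pairs.

    pair_likelihood[i] is the (assumed) probability of one observed pair
    when the system sits in state i; the state is assumed not to move.
    """
    weights = [p * (l ** k) for p, l in zip(prior, pair_likelihood)]
    total = sum(weights)
    return [x / total for x in weights]

eps = 1e-3
likelihood = [0.1, 0.9]                      # hypothetical diag(0.1, 0.9)
true_vector = conditional_initial_state([1 - eps, eps], likelihood, 10)
approx_vector = conditional_initial_state([1.0, 0.0], likelihood, 10)

# delta-sense error before and after the ten observations.
error_before = max((1 - eps) - 1.0, 0.0) + max(eps - 0.0, 0.0)
error_after = sum(max(a - b, 0.0) for a, b in zip(true_vector, approx_vector))
print(error_before, error_after)
```

The initial error is $\varepsilon$, yet after ten pairs the exact posterior
sits almost entirely on state 2 while the approximation remains at $e^1$.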

The distinction between $\delta$ and $\Delta$ is also illuminated by an
examination of the topologies they induce on $\Pi_N$: the topology induced
by $\delta$ is continuous, but $\Delta$ causes $\Pi_N$ to be separated into
faces of the form $\Pi_N(H) = \{\pi \in \Pi_N : \pi_i > 0 \iff i \in H\}$.
These are exactly the subsets on which a convex function over $\Pi_N$ is
guaranteed to be continuous (with respect to the Euclidean metric; see
Rockafellar [1970], Chapter 10).


c.  Some  Properties  of  Metric  D 

Metric  D is  introduced  mainly  for  the  purpose  of  making  continuity 
of  convex  functions  more  explicit. 


(12.8) Proposition. $\Delta[\pi,\pi'] \le D[\pi,\pi'] \le 4\Delta[\pi,\pi']$.


JAN  77  LK  PLATZMAN  AF-AF0SA-2273-72 


UNCLASSIFIED  ESL-N-723  AFOSft-TR-77-OMl  NL 




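Proof: Let $c_1, c_2$ be as in (12.6), so

\[ \Delta[\pi,\pi'] = \frac{1 - \sqrt{c_1 c_2}}{1 + \sqrt{c_1 c_2}}, \qquad D[\pi,\pi'] = 1 - \min(c_1, c_2) . \]

If $c_1 = 0$ or $c_2 = 0$, then the result is trivial. However, if
$c_1 \ne 0$ and $c_2 \ne 0$, then $\{i : \pi_i \ne 0\} = \{i : \pi'_i \ne 0\}$
and $c_1, c_2 \le 1$, since the entries of $\pi$ and $\pi'$ (respectively)
sum to one. Now:

\[ c_1 c_2 \le \min(c_1, c_2) = 1 - D[\pi,\pi'], \qquad
   c_1 c_2 = \left( \frac{1 - \Delta[\pi,\pi']}{1 + \Delta[\pi,\pi']} \right)^2 , \]

so that

\[ \Delta[\pi,\pi'] \le 1 - \sqrt{c_1 c_2} \le 1 - \min(c_1, c_2) = D[\pi,\pi'] \]

and

\[ D[\pi,\pi'] \le 1 - c_1 c_2 = \frac{4\Delta[\pi,\pi']}{1 + 2\Delta[\pi,\pi'] + \Delta^2[\pi,\pi']} \le 4\Delta[\pi,\pi'] . \qquad † \]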
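
The two-sided bound of (12.8) can be spot-checked on randomly drawn vectors.
The sketch below is only a numerical sanity check under stated assumptions:
$\Delta$ is evaluated by the closed form of (12.6), vectors are drawn with
strictly positive entries, and the helper names are hypothetical.

```python
import math
import random

def ratios(p, q):
    c1 = min(b / a for a, b in zip(p, q) if a > 0)
    c2 = min(a / b for a, b in zip(p, q) if b > 0)
    return c1, c2

def Delta(p, q):
    c1, c2 = ratios(p, q)
    r = math.sqrt(c1 * c2)
    return (1 - r) / (1 + r)

def D(p, q):
    return 1 - min(ratios(p, q))

def random_simplex_point(n, rng):
    x = [rng.random() + 1e-3 for _ in range(n)]
    s = sum(x)
    return [v / s for v in x]

rng = random.Random(0)
violations = 0
for _ in range(10000):
    p, q = random_simplex_point(4, rng), random_simplex_point(4, rng)
    lower_ok = Delta(p, q) <= D(p, q) + 1e-12
    upper_ok = D(p, q) <= 4 * Delta(p, q) + 1e-12
    if not (lower_ok and upper_ok):
        violations += 1
print(violations)
```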


(12.9) Lemma. Suppose $\pi,\pi' \in \Pi_N$. Then $d \in [0,1]$ satisfies
$D[\pi,\pi'] \le d$ iff $\exists\ \tilde\pi,\tilde\pi' \in \Pi_N$ such that:

\[ \pi' = (1-d)\pi + d\tilde\pi \]
\[ \pi = (1-d)\pi' + d\tilde\pi' \]

Proof: If $d = 0$, the proof is trivial. Assume $d > 0$ and let
$\tilde\pi = [\pi' - (1-d)\pi]/d$, $\tilde\pi' = [\pi - (1-d)\pi']/d$.
Clearly the entries of $\tilde\pi$ and of $\tilde\pi'$ each sum to one. But
$d \ge D[\pi,\pi'] \iff 1-d \le (\pi'_i/\pi_i),\ \forall i \in S \iff \tilde\pi \ge 0$
and similarly $\tilde\pi' \ge 0$.
Thus $\tilde\pi,\tilde\pi' \in \Pi_N \iff d \ge D[\pi,\pi']$. †

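(12.10) Corollary. Let $\pi = \sum_i \lambda_i \pi(i)$,
$\pi' = \sum_i \lambda'_i \pi'(i)$, $\lambda_i \ge 0$, $\lambda'_i \ge 0$,
$\pi(i), \pi'(i) \in \Pi_N$ and $\sum_i \lambda_i = \sum_i \lambda'_i = 1$.
Then:

\[ D[\pi,\pi'] \le \sup_{i,j} D[\pi(i), \pi'(j)] . \]

Proof: Let $d = \sup_{i,j} D[\pi(i), \pi'(j)]$ and construct
$\tilde\pi(i,j), \tilde\pi'(i,j) \in \Pi_N$ as in (12.9) so that:

\[ \pi'(j) = (1-d)\pi(i) + d\tilde\pi(i,j), \]
\[ \pi(i) = (1-d)\pi'(j) + d\tilde\pi'(i,j). \]

Then $\tilde\pi = \sum_i \sum_j \lambda_i \lambda'_j \tilde\pi(i,j)$ and
$\tilde\pi' = \sum_i \sum_j \lambda_i \lambda'_j \tilde\pi'(i,j)$ satisfy:

\[ \pi' = (1-d)\pi + d\tilde\pi \]
\[ \pi = (1-d)\pi' + d\tilde\pi' \]

and, by (12.9), $D[\pi,\pi'] \le d$. †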


d.  Continuity of Convex Functions

(12.11) Definition. (a) $V$ is the vector space of bounded
real-valued continuous functions on $\Pi_N$.

(b) $\|\cdot\|$ is the "sup norm,"

\[ \|v\| = \sup_{\pi \in \Pi_N} |v(\pi)| . \]
(12.12) Definition. For any $v \in V$, $\tilde v \in V$ denotes the White
projection of $v$, given by

\[ \tilde v(\pi) = v(\pi) - v(e^N) . \]

Remark: This projection generalizes a normalizing operation devised by
D.J. White [1963], for value functions having finite domain, to avoid
divergence in value iteration.

(12.13) Definition.

\[ \|v\|_D = [\sup_{\pi \in \Pi_N} v(\pi)] - [\inf_{\pi \in \Pi_N} v(\pi)] . \]

Interpretation: $\|\cdot\|_D$ is a norm on the subset $\tilde V$ of $V$, where
$\tilde V = \{\tilde v : v \in V\} = \{v \in V : v(e^N) = 0\}$.

(12.14) Lemma. $\|\tilde v\| \le \|\tilde v\|_D = \|v\|_D \le 2\|v\|$.

(12.15) Theorem. If $v \in V$ is convex, then

\[ |v(\pi) - v(\pi')| \le D[\pi,\pi']\,\|v\|_D, \qquad \forall \pi,\pi' \in \Pi_N . \]

Proof: Assume without loss of generality that $v(\pi) \ge v(\pi')$.
Following (12.9), construct $\tilde\pi'$ so that
$\pi = (1 - D[\pi,\pi'])\pi' + D[\pi,\pi']\tilde\pi'$. Then

\[ v(\pi) - v(\pi') \le (1 - D[\pi,\pi'])v(\pi') + D[\pi,\pi']v(\tilde\pi') - v(\pi')
   = D[\pi,\pi'][v(\tilde\pi') - v(\pi')] \le D[\pi,\pi']\,\|v\|_D . \qquad † \]
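
A numerical spot check of (12.15) is sketched below for one particular
convex function, $v(\pi) = \sum_i \pi_i^2$, chosen here purely for
illustration because its supremum (1, at any vertex) and infimum ($1/N$, at
the uniform vector) over $\Pi_N$ are known exactly, so that
$\|v\|_D = 1 - 1/N$. The helper names are hypothetical.

```python
import random

def D(p, q):
    """Metric D of (12.2)(c) for vectors with arbitrary supports."""
    c1 = min(b / a for a, b in zip(p, q) if a > 0)
    c2 = min(a / b for a, b in zip(p, q) if b > 0)
    return 1 - min(c1, c2)

def v(p):
    """A convex test function on the simplex (an illustrative choice)."""
    return sum(x * x for x in p)

def random_simplex_point(n, rng):
    x = [rng.random() + 1e-3 for _ in range(n)]
    s = sum(x)
    return [t / s for t in x]

N = 4
span = 1.0 - 1.0 / N             # ||v||_D = sup v - inf v over the simplex
rng = random.Random(1)
worst_slack = float("inf")
for _ in range(10000):
    p, q = random_simplex_point(N, rng), random_simplex_point(N, rng)
    worst_slack = min(worst_slack, D(p, q) * span - abs(v(p) - v(q)))
print(worst_slack)
```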
(12.16) Theorem. For every convex function $v \in V$, there is a quantity
$\|v\|_\Delta \le 4\|v\|_D$ such that

\[ |v(\pi) - v(\pi')| \le \Delta[\pi,\pi']\,\|v\|_\Delta, \qquad \forall \pi,\pi' \in \Pi_N . \]

Proof: Trivial, by (12.8) and (12.15).
13.  Contraction Properties of T


If $P$ is a stochastic matrix, and

\[ a[P] = \max_{i,j} \delta[\mathrm{row}_i[P], \mathrm{row}_j[P]] < 1, \]

then, for any $\pi,\pi' \in \Pi_N$,

\[ \delta[\pi P, \pi' P] \le a[P]\,\delta[\pi,\pi'], \]

i.e. the transformation $f[\pi] = \pi P$ is a contraction mapping in $\Pi_N$.
One consequence of this property is that $\{\pi P^k\}$ approaches a unique
limit as $k \to \infty$; this is, of course, the vector of steady-state
probabilities for a Markov chain having transition probability matrix $P$.
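
The contraction of $\delta$ under a stochastic matrix, and the resulting
convergence to steady-state probabilities, can be observed on a two-state
chain. The matrix below and the helper names are hypothetical, chosen only
for illustration.

```python
def delta(p, q):
    return sum(max(a - b, 0.0) for a, b in zip(p, q))

def times(pi, P):
    """Row vector times matrix."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

def a_coeff(P):
    """Largest delta-distance between two rows of P."""
    return max(delta(r, s) for r in P for s in P)

P = [[0.9, 0.1], [0.4, 0.6]]     # an illustrative stochastic matrix
p, q = [1.0, 0.0], [0.0, 1.0]

one_step = delta(times(p, P), times(q, P))
bound = a_coeff(P) * delta(p, q)

for _ in range(60):              # iterate both starting vectors
    p, q = times(p, P), times(q, P)
print(one_step, bound, p, q)
```

Starting from the two vertices the one-step bound is attained with
equality, and both sequences approach the steady-state vector $(0.8, 0.2)$.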

This section generalizes the concept of contractions in state
probability vectors to the information vector transition function T
[defined by (2.8) and (7.7)].


(13.1) Definition. An $N \times N$ substochastic matrix $P$ is said to be
subrectangular if, for every $i,j,i',j' \in S$,

\[ P_{ij} > 0 \text{ and } P_{i'j'} > 0 \implies P_{ij'} > 0 \text{ and } P_{i'j} > 0 . \]
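
Definition (13.1) can be tested entry by entry. The sketch below follows
the definition literally; the function name `is_subrectangular` and the
sample matrices are illustrative assumptions.

```python
def is_subrectangular(P):
    """Check P_ij > 0 and P_i'j' > 0  =>  P_ij' > 0 and P_i'j > 0."""
    positive = [(i, j) for i, row in enumerate(P)
                for j, x in enumerate(row) if x > 0]
    return all(P[i][j2] > 0 and P[i2][j] > 0
               for (i, j) in positive for (i2, j2) in positive)

# Positive entries form a "rectangle" of rows {0, 2} and columns {0, 1}.
rect = [[0.5, 0.5, 0.0], [0.0, 0.0, 0.0], [0.2, 0.3, 0.0]]
# A diagonal matrix is not subrectangular: entries (0,0) and (1,1) are
# positive but (0,1) and (1,0) are not.
diag = [[0.5, 0.0], [0.0, 0.5]]

print(is_subrectangular(rect), is_subrectangular(diag))
```

Equivalently, all nonzero rows of a subrectangular matrix share the same
set of positive columns.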
(13.2) Definition. If $P$ is a substochastic matrix and $P \ne 0$, then

(a) $\alpha[P] = \max\{\Delta[\mathrm{row}_i[P]/(\mathrm{row}_i[P]\mathbf{1}),\ \mathrm{row}_j[P]/(\mathrm{row}_j[P]\mathbf{1})] : \mathrm{row}_i[P] \ne 0,\ \mathrm{row}_j[P] \ne 0\}$.

Also $\alpha[z]$ denotes $\alpha[P(z)]$.

(b) $\bar\alpha[P] = \max\{D[\mathrm{row}_i[P]/(\mathrm{row}_i[P]\mathbf{1}),\ \mathrm{row}_j[P]/(\mathrm{row}_j[P]\mathbf{1})] : \mathrm{row}_i[P] \ne 0,\ \mathrm{row}_j[P] \ne 0\}$.

Also $\bar\alpha[z]$ denotes $\bar\alpha[P(z)]$.

Remark: The evaluation of $\alpha[P]$ or $\bar\alpha[P]$ by enumeration
requires $N^3$ operations. This is comparable to the effort expended in
multiplying two $N \times N$ matrices.


(13.3) Proposition. (a) $0 \le \alpha[P] \le 1$ and $0 \le \bar\alpha[P] \le 1$
for all substochastic matrices $P \ne 0$.

(b) $\alpha[P] < 1 \iff \bar\alpha[P] < 1 \iff P$ is subrectangular.

(c) $\alpha[P] = 0 \iff \bar\alpha[P] = 0 \iff P$ has rank 1.
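
The coefficients of (13.2) and the three cases of (13.3) can be illustrated
on small matrices. The sketch below evaluates $\Delta$ through the closed
form of (12.6); the function names `alpha` and `alpha_bar` and the sample
matrices are assumptions made for illustration.

```python
import math

def ratios(p, q):
    c1 = min(b / a for a, b in zip(p, q) if a > 0)
    c2 = min(a / b for a, b in zip(p, q) if b > 0)
    return c1, c2

def Delta(p, q):
    c1, c2 = ratios(p, q)
    r = math.sqrt(c1 * c2)
    return (1 - r) / (1 + r)

def D(p, q):
    return 1 - min(ratios(p, q))

def normalized_rows(P):
    """Nonzero rows of P, each scaled to sum to one."""
    return [[x / sum(row) for x in row] for row in P if sum(row) > 0]

def alpha(P):
    rows = normalized_rows(P)
    return max(Delta(r, s) for r in rows for s in rows)

def alpha_bar(P):
    rows = normalized_rows(P)
    return max(D(r, s) for r in rows for s in rows)

diagonal = [[0.5, 0.0], [0.0, 0.5]]          # not subrectangular
rank_one = [[0.2, 0.2], [0.1, 0.1]]          # rank 1
generic = [[0.25, 0.25], [0.125, 0.375]]     # subrectangular, rank 2

for name, P in [("diagonal", diagonal), ("rank one", rank_one), ("generic", generic)]:
    print(name, alpha(P), alpha_bar(P))
```

Both loops visit every ordered pair of rows, each comparison costing on the
order of $N$ operations, in line with the $N^3$ estimate of the remark
following (13.2).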

The  following  lemma  states  a well-known  property  of  the  Hajnal  measure. 

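(13.4) Lemma. If $w \in R^N$, and $\pi,\pi' \in \Pi_N$, then

\[ |\pi w - \pi' w| \le \delta[\pi,\pi']\,\{[\max_{i \in S} w_i] - [\min_{i \in S} w_i]\} . \]

Proof: Assume without loss of generality that $\pi w - \pi' w > 0$. Now,
writing $x^- = \min\{x, 0\}$,

\[ \pi w - \pi' w = \sum_{i \in S} (\pi_i - \pi'_i) w_i
   = \sum_{i \in S} (\pi_i - \pi'_i)^+ w_i + \sum_{i \in S} (\pi_i - \pi'_i)^- w_i \]
\[ \le \sum_{i \in S} (\pi_i - \pi'_i)^+ [\max_{i \in S} w_i] + \sum_{i \in S} (\pi_i - \pi'_i)^- [\min_{i \in S} w_i] \]
\[ = \delta[\pi,\pi'][\max_{i \in S} w_i] - \delta[\pi,\pi'][\min_{i \in S} w_i] . \qquad † \]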


Remark: (13.4) may be viewed as a stronger version of (12.15), where
$v$ is constrained to be linear.

Using (13.4), it is possible to demonstrate (13.5).


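(13.5) Theorem. (Contraction property of T). If $\pi,\pi' \in \Pi_N$ and
$z \in Z^+(\pi,\pi')$ then

\[ \Delta[T(\pi,z), T(\pi',z)] \le \alpha[z]\,\Delta[\pi,\pi'] . \]

Proof: Construct row vectors $\{\pi^i\}$ having elements

\[ \pi^i_j = \begin{cases} P_{ij}(z) / \sum_{j' \in S} P_{ij'}(z), & \text{if } i \in I(z) \\ 0, & \text{otherwise} \end{cases} \]

and define:

\[ \hat W = \{w \in R^N : w \ge 0,\ \pi P(z) w > 0,\ \pi' P(z) w > 0\} \]
\[ W = \{w \in R^N : w \ge 0,\ \pi w > 0,\ \pi' w > 0\} \]
\[ I(z,w) = \{i : \mathrm{row}_i[P(z)] w > 0\} \]

Since $\mathbf{1}$, the N-vector of ones, is an element of each, $\hat W$ and
$W$ are nonempty. Also, if $z \in Z^+(\pi,\pi')$ as required above, and
$w \in \hat W$, then $I(z,w)$ is nonempty. Finally
$\alpha[z] = \max_{i,i' \in I(z)} \{\Delta[\pi^i, \pi^{i'}]\}$ by (13.2)(a). Now

\[ \Delta[T(\pi,z), T(\pi',z)]
   = \sup_{w \in \hat W} \sum_{j \in S} \left( \frac{\sum_{i \in I(z,w)} \pi_i P_{ij}(z) w_j}{\pi P(z) w} - \frac{\sum_{i \in I(z,w)} \pi'_i P_{ij}(z) w_j}{\pi' P(z) w} \right)^+ \]
\[ = \sup_{w \in \hat W} \max_{J \subset S} \sum_{j \in J} \left( \frac{\sum_{i \in I(z,w)} \pi_i P_{ij}(z) w_j}{\pi P(z) w} - \frac{\sum_{i \in I(z,w)} \pi'_i P_{ij}(z) w_j}{\pi' P(z) w} \right) \]
\[ = \sup_{w \in \hat W} \max_{J \subset S} \sum_{i \in I(z,w)} \left( \frac{\pi_i \sum_{j' \in S} P_{ij'}(z) w_{j'}}{\pi P(z) w} - \frac{\pi'_i \sum_{j' \in S} P_{ij'}(z) w_{j'}}{\pi' P(z) w} \right) \frac{\sum_{j \in J} P_{ij}(z) w_j}{\sum_{j' \in S} P_{ij'}(z) w_{j'}} . \]

Application of (13.4) now yields

\[ \Delta[T(\pi,z), T(\pi',z)]
   \le \sup_{w \in \hat W} \max_{J \subset S} \left\{ \sum_{i \in I(z,w)} \left( \frac{\pi_i \sum_{j' \in S} P_{ij'}(z) w_{j'}}{\pi P(z) w} - \frac{\pi'_i \sum_{j' \in S} P_{ij'}(z) w_{j'}}{\pi' P(z) w} \right)^+ \right\}
   \max_{i,i' \in I(z,w)} \left( \frac{\sum_{j \in J} \pi^i_j w_j}{\pi^i w} - \frac{\sum_{j \in J} \pi^{i'}_j w_j}{\pi^{i'} w} \right) \]
\[ \le \sup_{w \in W} \left\{ \sum_{i \in I(z)} \left( \frac{\pi_i w_i}{\pi w} - \frac{\pi'_i w_i}{\pi' w} \right)^+ \right\}
   \cdot \sup_{w \in \hat W} \max_{i,i' \in I(z,w)} \sum_{j \in S} \left( \frac{\pi^i_j w_j}{\pi^i w} - \frac{\pi^{i'}_j w_j}{\pi^{i'} w} \right)^+ \]
\[ \le \Delta[\pi,\pi'] \cdot \alpha[z], \]

where the last inequality follows from (12.4). †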
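
The contraction property (13.5) can be spot-checked numerically. The sketch
below rests on assumptions that go beyond the text of this section: the
transition function is taken to be $T(\pi,z) = \pi P(z)/(\pi P(z)\mathbf{1})$
(its definition, (2.8), lies outside this section), $\Delta$ is evaluated by
the closed form of (12.6), and $P(z)$ is replaced by randomly generated
strictly positive substochastic matrices. All function names are
hypothetical.

```python
import math
import random

def Delta(p, q):
    c1 = min(b / a for a, b in zip(p, q) if a > 0)
    c2 = min(a / b for a, b in zip(p, q) if b > 0)
    r = math.sqrt(c1 * c2)
    return (1 - r) / (1 + r)

def normalize(v):
    s = sum(v)
    return [x / s for x in v]

def T(pi, P):
    """Assumed form of the transition function: normalized pi P."""
    n = len(P)
    return normalize([sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)])

def alpha(P):
    rows = [normalize(row) for row in P if sum(row) > 0]
    return max(Delta(r, s) for r in rows for s in rows)

rng = random.Random(2)
N = 3
violations = 0
for _ in range(5000):
    # Random positive substochastic matrix (row sums below one).
    P = [[0.01 + 0.3 * rng.random() for _ in range(N)] for _ in range(N)]
    p = normalize([rng.random() + 1e-3 for _ in range(N)])
    q = normalize([rng.random() + 1e-3 for _ in range(N)])
    if Delta(T(p, P), T(q, P)) > alpha(P) * Delta(p, q) + 1e-9:
        violations += 1
print(violations)
```

With vertex vectors such as $e^1$ and $e^2$ the bound is attained for the
pair of rows that defines $\alpha$, which is how $\alpha[z]$ arises as the
contraction coefficient.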
(13.6) Corollary. $\alpha[z z'] \le \alpha[z]\,\alpha[z']$.

Proof: By (13.2), $\alpha[z z'] = \max_{i,i' \in I(z z')} \{\Delta[T(e^i, z z'), T(e^{i'}, z z')]\}$.
But, following (13.5),

\[ \Delta[T(e^i, z z'), T(e^{i'}, z z')] = \Delta[T(T(e^i, z), z'), T(T(e^{i'}, z), z')] \]
\[ \le \alpha[z']\,\Delta[T(e^i, z), T(e^{i'}, z)] \]
\[ \le \alpha[z']\,\alpha[z]\,\Delta[e^i, e^{i'}] \]
\[ = \alpha[z']\,\alpha[z] . \qquad † \]

The corresponding result for $\bar\alpha[z]$ is considerably weaker.

(13.7) Proposition. For $\pi,\pi' \in \Pi_N$, $z \in Z^+(\pi,\pi')$,
$D[T(\pi,z), T(\pi',z)] \le \bar\alpha[z]$.

Proof: $T(\pi,z) = \sum_{i \in S} \lambda_i T(e^i, z)$ where
$\lambda_i = \pi_i (e^i P(z) \mathbf{1}) / (\pi P(z) \mathbf{1})$. (12.10)
completes the proof.

Remark: This is not a contraction.

(13.8) Corollary. $\bar\alpha[z z'] \le \bar\alpha[z']$.
14.  Detectability

a.  Preview

The intuitive notion of detectability was introduced in Section 5d;
essentially, a detectable FPS has the property that the information
vector is arbitrarily closely approximated on the basis of the memory
state alone, if the memory set is sufficiently large. The extent to
which an information vector depends on input-output pairs not contained
in the memory state is given by $\alpha[z^M(k)]$, the contraction induced on
the information vector by the input-output pairs contained in the
memory state. Recall that by (13.3)(b), $\alpha[z^M(k)] < 1$ iff $P(z^M(k))$
is subrectangular.

Four types of detectability will be defined; these are:

(i)  strong subrectangularity (SSR), a condition under which every
transition probability matrix is subrectangular.

(ii)  weak subrectangularity (WSR), a condition under which every
transition has positive probability of generating an input-output
pair to which a subrectangular transition probability matrix
corresponds.

(iii)  strong detectability (SDT), a condition under which there exists
a memory set whose essential elements each correspond to
subrectangular transition probability matrices.

(iv)  weak detectability (WDT), a condition under which the memory
state at any given time has positive probability of corresponding
to a subrectangular transition probability matrix.


These definitions differ in the type of approximation closeness implied,
and in the complexity of procedures which establish this closeness.

The following implications are trivially verified:

\[ \mathrm{SSR} \implies \mathrm{WSR} \implies \mathrm{WDT}, \qquad \mathrm{SSR} \implies \mathrm{SDT} \implies \mathrm{WDT} . \]

Each type of detectability will be investigated in turn. It will
be shown, for each, that a finite-memory $\varepsilon$-optimal observer may
be constructed, and how the estimation error and memory size interrelate.

b.  Strong Subrectangularity

(14.1) Definition. An FPS satisfies the condition of strong
subrectangularity (SSR) if $P(z)$ is subrectangular, $\forall z \in Z$.

(14.2) Definition. For an FPS satisfying SSR, define

\[ \alpha = \max_{z \in Z} \{\alpha[z]\} \]
\[ \tau = (-\log \alpha) / (\log \#Z) \]

Remark: The logarithms may be taken to any desired base.

Remark: By (14.1), SSR $\implies \alpha < 1$.

Remark: The definitions of $\tau$ given here and later in this section
are consistent with (1.2).

(14.3) Proposition. If an FPS satisfies SSR then, for any
$m \in \langle 0,\infty \rangle$, $k \in \langle 0,m \rangle$,

\[ \alpha[z(k-m;k)] \le \alpha^m . \]

Proof: By (13.6),
$\alpha[z(k-m;k)] \le \alpha[z(k-m;k+1-m)]\,\alpha[z(k+1-m;k+2-m)] \cdots \alpha[z(k-1;k)] \le \alpha^m$. †

(14.4) Theorem. Consider an FPS satisfying SSR, along with the memory
set $M = \{Z^{m*} \cap Z^+\}$. Let $\hat\pi : M \to \Pi_N$ be a mapping
satisfying:

\[ \hat\pi(z) P(z) \ne 0, \quad z \in Z^m \cap Z^+ \]
\[ \hat\pi(z) = \pi(0), \quad z \in Z^{(m-1)*} \cap Z^+ \]

Define $\tilde\pi(z) = T(\hat\pi(z), z)$. Then

\[ \Delta[\pi(k), \tilde\pi(z^M(k))] \le \alpha^m, \qquad k \in \langle 0,\infty \rangle . \]

Proof: If $k < m$, then $\pi(k) = \tilde\pi(z^M(k))$. But if $k \ge m$, then
$z^M(k) = z(k-m;k)$ and, by (14.3) and (13.5),

\[ \Delta[\pi(k), \tilde\pi(z^M(k))]
   \le \Delta[T(\pi(k-m), z(k-m;k)),\ T(\hat\pi(z^M(k)), z(k-m;k))] \]
\[ \le \Delta[\pi(k-m), \hat\pi(z^M(k))]\,\alpha^m \]
\[ \le \alpha^m . \qquad † \]

Interpretation: There is a finite-memory observer requiring no more
than $(\#Z)^m$ essential memory states which generates estimates of the
information vector lying within $\alpha^m$ of its true value (in a $\Delta$
sense; (12.3)(a) determines $\delta$ and $|\cdot|$-sense bounds on this
error).

Generalization: The approximate relationship between essential memory
$m$ and maximum error $\varepsilon$ is:

\[ \varepsilon \approx m^{-\tau}, \qquad m \approx \varepsilon^{-1/\tau} \tag{14.5} \]

However, the strict bounds are:

\[ \varepsilon \le (m/\#Z)^{-\tau}, \qquad m \le (\varepsilon/\alpha)^{-1/\tau} \tag{14.6} \]

Specifically, this means that no more than $(\varepsilon/\alpha)^{-1/\tau}$
essential memory states are required to maintain a maximum error less
than $\varepsilon$, and that $m$ essential memory states can achieve an
error bounded above by $(m/\#Z)^{-\tau}$.
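
The trade-off between memory and error under SSR can be made concrete with
a short calculation. The sketch below uses only the statement that words of
length $m$ (at most $(\#Z)^m$ essential memory states) give an error within
$\alpha^m$; the numerical values and the function name are hypothetical.

```python
import math

def ssr_memory_for_error(alpha, n_pairs, eps):
    """Smallest word length m with alpha**m <= eps, and the count (#Z)**m."""
    m = max(0, math.ceil(math.log(eps) / math.log(alpha)))
    return m, n_pairs ** m

alpha, n_pairs, eps = 0.5, 4, 0.01           # illustrative values
m, states = ssr_memory_for_error(alpha, n_pairs, eps)
tau = -math.log(alpha) / math.log(n_pairs)   # exponent of (14.2)
print(m, states, tau, states ** (-tau))
```

Because the number of memory states is exactly $(\#Z)^m$ here, the quantity
`states ** (-tau)` coincides with $\alpha^m$, which is the relationship
summarized in (14.5).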
c.  Weak Subrectangularity

(14.7) Definition. An FPS satisfies the condition of weak
subrectangularity (WSR) if, for every $i \in S$, $u \in U$, there is a
$y \in Y$ such that $P(y|u)$ is subrectangular and $e^i P(y|u) \ne 0$.

(14.8) Definition. For an FPS satisfying WSR, define

\[ \alpha = \max_{i \in S} \max_{u \in U} \sum_{y \in Y} \sum_{j \in S} P_{ij}(y|u)\,\alpha[(u,y)] \]
\[ \tau = (-\log \alpha) / (\log \#Z) \]

Remark: By (14.7), WSR $\implies \alpha < 1$.
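
The constants of (14.2) and (14.8) can be computed side by side for a small
system. The sketch below is illustrative only: the two-state,
one-input, two-output system is hypothetical, the layout `P[u][y][i][j]`
for $P_{ij}(y|u)$ is an assumption, and $\alpha[(u,y)]$ is evaluated with
the closed form of (12.6).

```python
import math

def Delta(p, q):
    c1 = min(b / a for a, b in zip(p, q) if a > 0)
    c2 = min(a / b for a, b in zip(p, q) if b > 0)
    r = math.sqrt(c1 * c2)
    return (1 - r) / (1 + r)

def alpha_of(Pz):
    rows = [[x / sum(row) for x in row] for row in Pz if sum(row) > 0]
    return max(Delta(r, s) for r in rows for s in rows)

def ssr_alpha(P):
    """Maximum of alpha[(u, y)] over all input-output pairs, as in (14.2)."""
    return max(alpha_of(Py) for Pu in P for Py in Pu)

def wsr_alpha(P):
    """Worst-case expected one-step coefficient, as in (14.8)."""
    n = len(P[0][0])
    return max(sum(sum(Py[i]) * alpha_of(Py) for Py in Pu)
               for Pu in P for i in range(n))

# Hypothetical FPS: for each state the rows of the two output matrices
# together sum to one.
P = [[
    [[0.4, 0.2], [0.1, 0.3]],    # P(y=1 | u)
    [[0.1, 0.3], [0.3, 0.3]],    # P(y=2 | u)
]]

a_ssr, a_wsr = ssr_alpha(P), wsr_alpha(P)
n_pairs = 2                                   # #Z = #U * #Y
print(a_ssr, a_wsr)
print(-math.log(a_ssr) / math.log(n_pairs), -math.log(a_wsr) / math.log(n_pairs))
```

Since the WSR constant averages $\alpha[(u,y)]$ over outputs instead of
taking the maximum, it cannot exceed the SSR constant when both are
defined, and the corresponding $\tau$ is at least as large.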

(14.9) Proposition. If an FPS satisfies WSR, then for any
$m \in \langle 0,\infty \rangle$, $k \in \langle 0,m \rangle$, $\pi \in \Pi_N$,
and any strategy $\gamma$,

\[ E_\gamma\{\Delta[\pi(k), T(\pi, z(k-m;k))]\} \le \alpha^m . \]

Proof: (By induction) If $m=0$ the result is trivial. But

\[ E_\gamma\{\alpha[z(k-m;k)]\}
   \le E_\gamma\{\alpha[z(k-m;k-1)] \cdot E_\gamma\{\alpha[z(k-1;k)] \mid z(k-m;k-1)\}\} \]
\[ = E_\gamma\{\alpha[z(k-m;k-1)] \cdot E_\gamma\{\alpha[z(k-1;k)] \mid z(k-m;k-1),\ s(k-1),\ u(k-1)\}\} \]
\[ = E_\gamma\Big\{\alpha[z(k-m;k-1)] \cdot \Big\{\sum_{y \in Y} \sum_{j \in S} P_{s(k-1)j}(y|u(k-1))\,\alpha[(u(k-1),y)]\Big\}\Big\} \]
\[ \le E_\gamma\{\alpha[z(k-m;k-1)] \cdot \alpha\} \]
\[ = \alpha \cdot E_\gamma\{\alpha[z(k-(m-1);k)]\} . \qquad † \]

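(14.10) Theorem. Consider an FPS satisfying WSR, along with the
memory set $M = \{Z^{m*} \cap Z^+\}$. Let $\hat\pi : M \to \Pi_N$ be a mapping
satisfying:

\[ \hat\pi(z) P(z) \ne 0, \quad z \in Z^m \cap Z^+ \]
\[ \hat\pi(z) = \pi(0), \quad z \in Z^{(m-1)*} \cap Z^+ \]

Define $\tilde\pi(z) = T(\hat\pi(z), z)$; then for any strategy $\gamma$,

\[ E_\gamma\{\Delta[\pi(k), \tilde\pi(z^M(k))]\} \le \alpha^m . \]

Proof: If $k < m$, then $z^M(k) \in Z^{(m-1)*}$, and
$\tilde\pi(z^M(k)) = \pi(k)$. But if $k \ge m$, then $\ell(z^M(k)) = m$, and,
using (13.6) and (14.9),

\[ E_\gamma\{\Delta[\pi(k), \tilde\pi(z^M(k))]\}
   = E_\gamma\{\Delta[T(\pi(k-m), z^M(k)),\ T(\hat\pi(z^M(k)), z^M(k))]\}
   \le E_\gamma\{\alpha[z^M(k)]\} \le \alpha^m . \qquad † \]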
Interpretation: There is a finite-memory observer requiring $(\#Z)^m$
essential memory states which generates estimates of the information
vector lying on the average within $\alpha^m$ of its true value (in a
$\Delta$ sense).

Generalization: The approximate relationship between essential memory
$m$ and mean error $\bar\varepsilon$ is:

\[ \bar\varepsilon \approx m^{-\tau}, \qquad m \approx \bar\varepsilon^{-1/\tau} \tag{14.11} \]

However, the strict bounds are

\[ \bar\varepsilon \le (m/\#Z)^{-\tau}, \qquad m \le (\bar\varepsilon/\alpha)^{-1/\tau} \tag{14.12} \]

Specifically, this means that no more than $(\bar\varepsilon/\alpha)^{-1/\tau}$
essential memory states are required to maintain a mean error below
$\bar\varepsilon$, and that $m$ essential memory states achieve a mean error
bounded above by $(m/\#Z)^{-\tau}$.

d.  Strong Detectability

(14.13) Definition. An FPS satisfies the condition of strong
detectability (SDT) if there exists an integer $\hat\ell$ such that $P(z)$
is subrectangular, $\forall z \in Z^{\hat\ell} \cap Z^+$.

(14.14) Definition. For an FPS satisfying SDT, define

\[ \alpha_k = \max_{z \in Z^k \cap Z^+} \{\alpha[z]\} \]
\[ \hat\ell = \min\{k : \alpha_k < 1\} \]
\[ \alpha = \alpha_{\hat\ell} \]
\[ \tau = (-\log \alpha) / (\hat\ell \log \#Z) \]

Remark: By (14.13), SDT $\implies \alpha < 1$.

Remark: If an FPS satisfies SSR, then definitions (14.2) and (14.14)
are consistent, since $\hat\ell = 1$.

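(14.15) Proposition. If an FPS satisfies SDT, then for any
$m \in \langle 0,\infty \rangle$, $k \in \langle 0,m \rangle$,

\[ \alpha[z(k-m;k)] \le \alpha^{m \div \hat\ell} . \]

Proof: By (13.6),

\[ \alpha[z(k-m;k)] \le \alpha[z(k-m;\ k-((m \div \hat\ell)-1)\hat\ell)]
   \cdot \alpha[z(k-((m \div \hat\ell)-1)\hat\ell;\ k-((m \div \hat\ell)-2)\hat\ell)]
   \cdot \ldots \cdot \alpha[z(k-\hat\ell;k)] \le \alpha^{m \div \hat\ell} . \qquad † \]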

(14.16) Theorem. Consider an FPS satisfying SDT, along with the
memory set $M = \{Z^{m*} \cap Z^+\}$. Let $\hat\pi : M \to \Pi_N$ be a mapping
satisfying:

\[ \hat\pi(z) P(z) \ne 0, \quad z \in Z^m \cap Z^+ \]
\[ \hat\pi(z) = \pi(0), \quad z \in Z^{(m-1)*} \cap Z^+ \]

Define $\tilde\pi(z) = T(\hat\pi(z), z)$. Then
$\Delta[\pi(k), \tilde\pi(z^M(k))] \le \alpha^{m \div \hat\ell}$.

Proof: If $k < m$, then $z^M(k) \in Z^{(m-1)*}$ and
$\tilde\pi(z^M(k)) = \pi(k)$. But if $k \ge m$, then $\ell(z^M(k)) = m$, and,
using (13.6) and (14.15),

\[ \Delta[\pi(k), \tilde\pi(z^M(k))]
   = \Delta[T(\pi(k-m), z^M(k)),\ T(\hat\pi(z^M(k)), z^M(k))]
   \le \alpha[z^M(k)] \le \alpha^{m \div \hat\ell} . \qquad † \]

Interpretation: There is a finite-memory observer, requiring no more
than $(\#Z)^m$ essential memory states, which generates estimates of the
information vector lying within $\alpha^{m \div \hat\ell}$ of its true value
(in a $\Delta$ sense).

Generalization: The approximate relationship between essential memory
$m$ and maximum error $\varepsilon$ is

\[ \varepsilon \approx m^{-\tau}, \qquad m \approx \varepsilon^{-1/\tau} \tag{14.17} \]

However, the strict bounds are

\[ \varepsilon \le (m/(\#Z)^{\hat\ell})^{-\tau}, \qquad m \le (\varepsilon/\alpha)^{-1/\tau} \tag{14.18} \]

Specifically, this means that no more than $(\varepsilon/\alpha)^{-1/\tau}$
essential memory states are required to maintain a maximum error below
$\varepsilon$, and that $m$ essential memory states can achieve a maximum
error bounded above by $(m/(\#Z)^{\hat\ell})^{-\tau}$.


e.  Weak  Detectability 

(14.19) Definition. If k is an integer and $\phi : Z^{k*} \to U$, then for any $z = (u_1,y_1)(u_2,y_2) \ldots (u_k,y_k) \in Z^k$ define:

$\sigma[z, \phi] = 1$ if $u_{j+1} = \phi[(u_1,y_1) \ldots (u_j,y_j)]$ for all $j \in <0,k-1>$, and $\sigma[z, \phi] = 0$ otherwise.

Interpretation: $\sigma[z, \phi] = 1$ if $z(0;k) = z$ can evolve when inputs are selected according to the rule $u(k) = \phi[z(0;k)]$. Thus, if $\pi(0)$ is the initial state probability vector, and inputs are selected according to $\phi$, then the probability distribution for the random variable $z(0;k)$ is:

$\mathrm{Prob}\{z(0;k) = z\} = \sigma[z, \phi]\,(\pi(0) P(z) \mathbf{1})$
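The probability formula above can be sketched in a few lines of Python. This is only an illustration: the dictionary `P`, keyed by input-output pairs, and the representation of the rule `phi` as a function of the past word are assumptions made here, not notation from the text.

```python
def word_probability(pi0, P, phi, word):
    """Probability that the first k input-output pairs equal `word`
    when inputs are selected by the rule `phi` (a function of the past word).

    pi0  : initial state probability vector (list of floats)
    P    : hypothetical dict mapping a pair (u, y) to the matrix P(y|u)
    word : list of (u, y) pairs
    """
    # sigma[z, phi]: every input must be the one the rule prescribes.
    for j, (u, _) in enumerate(word):
        if u != phi(tuple(word[:j])):
            return 0.0
    # Row vector pi(0) P(z), where P(z) is the product of one-step matrices.
    row = list(pi0)
    n = len(row)
    for pair in word:
        M = P[pair]
        row = [sum(row[i] * M[i][j] for i in range(n)) for j in range(n)]
    return sum(row)  # pi(0) P(z) 1


# Two states, one input (0), two outputs ('a', 'b'); rows of P(a|0)+P(b|0) sum to 1.
P = {
    (0, 'a'): [[0.5, 0.0], [0.0, 0.25]],
    (0, 'b'): [[0.0, 0.5], [0.75, 0.0]],
}
always_zero = lambda past: 0
p_ab = word_probability([1.0, 0.0], P, always_zero, [(0, 'a'), (0, 'b')])
```

Summing `word_probability` over all words of a fixed length that are consistent with the rule gives one, which is a convenient sanity check on any model entered this way.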

(14.20) Definition.

$\bar{\alpha}_k = \max_{i \in S} \max_{\phi \in U^{(Z^{k*})}} \sum_{z \in Z^k} \sigma[z, \phi]\,(e^i P(z) \mathbf{1})\,\alpha[z]$

av  = max.  „ max  , . fl  . 0[z,  <t>]  (e*P(*)l)a[z]"l 

*eu<z  >UeZ  ' ' J 

Interpretation: $\bar{\alpha}_\ell$ is the largest possible value of $E_\gamma\{\alpha[z(k-\ell; k)]\}$, where $\gamma$ is a feasible strategy. $\underline{\alpha}_\ell$ likewise is the expectation of $\alpha[z(k-\ell; k)]$.

(14.21) Definition. An FPS satisfies the condition of weak detectability (WDT) if there exists an integer $\ell$ such that $\bar{\alpha}_\ell < 1$.

(14.22) Definition. For an FPS satisfying WDT, define

• $\bar{\ell} = \min\{\ell : \bar{\alpha}_\ell < 1\}$
• $\bar{\alpha} = \bar{\alpha}_{\bar{\ell}}$
• $\bar{\tau} = (-\log \bar{\alpha})/(\bar{\ell} \log \#Z)$

Remark: By (14.21), $\bar{\alpha} < 1$.

Remark: If an FPS satisfies WSR, then definitions (14.8) and (14.22) are consistent. If an FPS satisfies SDT then $\bar{\ell} \le \hat{\ell}$, and if $\bar{\ell} = \hat{\ell}$ then $\bar{\alpha} \le \hat{\alpha}$.
(14.23) Proposition. If an FPS satisfies WDT, then for any $m \in <0,\infty>$, $k \in <m,\infty>$, and any feasible strategy $\gamma$,

$E_\gamma\{\alpha[z(k-m; k)]\} \le \bar{\alpha}^{[m/\bar{\ell}]}$

Proof: Consider a transformed system in which the input is a mapping $\phi : Z^{(\bar{\ell}-1)*} \to U$, specified at intervals of $\bar{\ell}$ time units, each of which describes $u(k), u(k+1), \ldots, u(k+\bar{\ell}-1)$ as functions of $e$, $z(k; k+1)$, $z(k; k+2)$, ..., $z(k; k+\bar{\ell}-1)$ respectively. The output at time k is $z(k-\bar{\ell}; k)$. This transformed system satisfies WSR; the desired result follows from (14.9).  †

(14.24) Theorem. Consider an FPS satisfying WDT along with the memory set $M = Z^{m*} \cap Z^+$. Let $\hat{\pi} : M \to \Pi_N$ be a mapping satisfying:

$\hat{\pi}(z)P(z) \ne 0$, $z \in Z^m \cap Z^+$
$\hat{\pi}(z) = \pi(0)$, $z \in Z^{(m-1)*} \cap Z^+$

Define $\hat{\eta}(z) = T(\hat{\pi}(z), z)$. Then, for any feasible strategy $\gamma$,

$E_\gamma\{\Delta[\eta(k), \hat{\eta}(z_m(k))]\} \le \bar{\alpha}^{[m/\bar{\ell}]}$

Proof: If $k < m$, then $z_m(k) \in Z^{(m-1)*}$ and $\eta(k) = \hat{\eta}(z_m(k))$. But if $k \ge m$, then $z_m(k) \in Z^m$ and, using (13.6) and (14.23),

$E_\gamma\{\Delta[\eta(k), \hat{\eta}(z_m(k))]\} = E_\gamma\{\Delta[T(\eta(k-m), z_m(k)), T(\hat{\pi}(z_m(k)), z_m(k))]\} \le E_\gamma\{\alpha[z_m(k)]\} \le \bar{\alpha}^{[m/\bar{\ell}]}$  †

Interpretation: There is a finite-memory observer, requiring at most $(\#Z)^m$ essential memory states, which generates estimates of the information vector lying on the average within $\bar{\alpha}^{[m/\bar{\ell}]}$ of its true value (in a $\Delta$ sense).


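Generalizations: The approximate relationship between essential memory m and mean error $\bar{\varepsilon}$ is:

$\bar{\varepsilon} \approx m^{-\bar{\tau}}$  (14.25)

However, the strict bounds are:

$\bar{\varepsilon} \le (m/(\#Z)^{\bar{\ell}})^{-\bar{\tau}}$
$m \le (\bar{\varepsilon}/\bar{\alpha})^{-1/\bar{\tau}}$  (14.26)

Specifically, this means that no more than $(\bar{\varepsilon}/\bar{\alpha})^{-1/\bar{\tau}}$ essential memory states are required to maintain a mean error below $\bar{\varepsilon}$, and that m essential memory states can achieve a mean error bounded above by $(m/(\#Z)^{\bar{\ell}})^{-\bar{\tau}}$.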
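As a rough numerical illustration of the exponent in (14.22) and the strict bounds in (14.26), the following sketch evaluates them for given parameters. The function and argument names are hypothetical; `n_pairs` stands for #Z and must exceed one.

```python
import math


def decay_exponent(alpha, ell, n_pairs):
    """tau = (-log alpha) / (ell * log #Z), as in (14.22)."""
    return -math.log(alpha) / (ell * math.log(n_pairs))


def error_bound(m, alpha, ell, n_pairs):
    """Upper bound (m / (#Z)^ell)^(-tau) on the mean error with m memory states."""
    tau = decay_exponent(alpha, ell, n_pairs)
    return (m / n_pairs ** ell) ** (-tau)


def memory_bound(eps, alpha, ell, n_pairs):
    """Upper bound (eps / alpha)^(-1/tau) on the memory states needed for error eps."""
    tau = decay_exponent(alpha, ell, n_pairs)
    return (eps / alpha) ** (-1.0 / tau)


# Illustrative values only: alpha = 0.5, ell = 1, #Z = 2 gives tau = 1.
bound_for_8_states = error_bound(8, alpha=0.5, ell=1, n_pairs=2)
states_for_quarter = memory_bound(0.25, alpha=0.5, ell=1, n_pairs=2)
```

With these illustrative values the error bound falls off inversely with the number of memory states, which matches the power-law form of (14.25).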
15.  Decomposition  of  a Free  FPS  Into  Detectable  Parts

This  section  is  concerned  with  FPS's  that  are  not  detectable.  An 
example  of  such  a system  was  given  in  Section  5a.  An  FPS  fails  to  be 
detectable  when  some  function  of  the  (internal  or  augmented)  state  may 
be  recursively  updated,  but  is  never  identified  exactly.  This  function 
depends  on  the  input  process,  and  for  this  reason,  the  decomposition 
of  an  FPS  into  detectable  parts  is  meaningful  only  in  the  case  of  a 
free  FPS. 

(15.1) Definition. (a) $C_i(k) = \{j : P_{ij}(z(0;k)) > 0\} \subset S$

(b) $C(k) = \{C_i(k) : i \in S\} - \{\emptyset\}$

(c) $\mu(k) = \#C(k)$.

Interpretation. $C_i(k)$ is the set of possible present internal states given that $s(0) = i$. $C(k)$ is the set of possible state configurations which may result from specification of the initial state. In a detectable system, $\mu(k) \to 1$.

(15.2) Proposition. (a) $C_i(k+1) = \{j : P_{i'j}(y(k+1)|u(k)) > 0,\ i' \in C_i(k)\}$

(b) $C(k+1) = \{\{j : P_{ij}(y(k+1)|u(k)) > 0,\ i \in C'\},\ C' \in C(k)\} - \{\emptyset\}$

(c) $\mu(k+1) \le \mu(k)$.
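The recursion in (15.2)(b) amounts to pushing each configuration through the support of the observed transition matrix and discarding empty results. A minimal sketch, assuming configurations are stored as a Python set of frozensets and `P_y` is the matrix for the pair just observed (both representations are choices made here for illustration):

```python
def update_configurations(C, P_y):
    """One step of (15.2)(b).

    C   : set of frozensets of state indices (the current C(k))
    P_y : transition matrix for the observed pair, as a list of rows
    """
    n = len(P_y)
    new_C = set()
    for block in C:
        # States reachable with positive probability from some state in the block.
        successors = frozenset(
            j for i in block for j in range(n) if P_y[i][j] > 0
        )
        if successors:  # empty configurations are removed
            new_C.add(successors)
    return new_C


# Three states; the observed output merges states 0 and 1 and keeps state 2 apart.
C0 = {frozenset({0}), frozenset({1}), frozenset({2})}
P_y = [[0, 1, 0], [0, 1, 0], [0, 0, 1]]
C1 = update_configurations(C0, P_y)
```

Because distinct configurations can map to the same successor set but never split, the count `len(C)` cannot increase, in line with (15.2)(c).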

Consider a free connected FPS, i.e. one whose underlying process has an entirely recurrent state set. If pairs $[C(k), s(k)]$ are considered in place of the internal state, recurrent chains of such pairs may be determined. By (15.2)(c), $\mu(k)$ is constant within each recurrent chain. If every recurrent chain is such that $\mu(k) = 1$, then the system satisfies WDT, because if $C(k)$ is at any time reset to $\{\{i\} : \eta_i(k) > 0\}$, it will tend to a value containing one element, indicating that the word of intervening input-output pairs had a subrectangular transition probability matrix. On the other hand, if $\mu(k)$ remains greater than one for all time, then subrectangular input-output words cannot occur.

If the free connected FPS is such that $\mu(k)$ need not tend to one, then the process can be described as one of at most N detectable models, which may be asymptotically identified. This decomposition is effected by allowing $\mu(k)$ to reach its minimal value, and by then assuming that the current state lies in a particular element of $C(k)$. This determines the element of $C(k)$ containing the current state at all times, and the likelihood of a particular model can be updated periodically. Since only one model is correct, its likelihood will approach one, unless some models are identical, in which case it doesn't matter which is identified.

Note that, in order to determine whether a free connected FPS is detectable, one determines whether the process $\{\{i : \eta_i(k) > 0\}\}$ (which equals $C(k)$ if $\mu(k) = 1$) is simply connected in $2^S$. This illustrates a duality between the notions of connectivity and detectability.

The  decomposition  procedure  is  readily  extended  to  general  free 
FPS's.  Transient  states  may  be  ignored  since  information  vector  entries 
corresponding  to  transient  states  have  expectation  that  vanishes 
geometrically  as  the  number  of  available  (most  recent)  input-output 
pairs  increases,  and  contemplation  of  an  infinite  past  eliminates 
transient  states  at  time  zero.  If  the  free  FPS  has  more  than  one 
recurrent  class,  then  the  test  for  detectability  is  performed  on  the 
system  restricted  to  one  recurrent  class  at  a time;  certain  recurrent 
classes  may  be  identified  exactly  on  the  basis  of  a particular  output 
configuration  (that  eventually  occurs);  others  may  be  identified  on  the 
basis  of  the  infinite  past;  still  others  may  be  identical  from  an  input- 
output  point  of  view. 

Since  the  decomposition  depends  crucially  on  a classification  of 
states  as  transient  or  recurrent,  it  cannot  be  extended  to  FPS's  with 
inputs;  in  practical  applications,  though,  it  often  suffices  to  consider 
the  free  system  under  a particular  adapted  strategy. 


16.  Stochastic  Realization  of  a Free  FPS 


The  stochastic  realization  problem  includes  that  of  deciding 
whether  or  not  a given  free  FPS  is  equivalent  to  a state-calculable 
one.  Such  a property  would  be  desirable  because  it  would  indicate 
that  after  a sufficiently  long  initial  identification  procedure,  the 
present  state  could  be  arbitrarily  closely  known,  and  the  optimal 
strategy  in  the  steady-state  could  be  computed  by  assuming  that  the 
internal  state  was  known  exactly.  This  property  would  be  equivalent 
to the following condition: $\{\eta(k)\}$ has a finite number of cluster points in $\Pi_N$ with probability one. It will be suggested here that such is generally not the case.

(16.1) Theorem. For a given free, connected, strongly subrectangular FPS in minimal state form (Paz [1971]), the following statements are equivalent:

(a) The FPS is equivalent to one that is state calculable.

(b) The process $\{z(k-N(N-1)/2; k)\}$ is a Markov chain.

Proof: Assume first that every matrix of the form $P(z)$, $z \in Z^{N(N-1)/2}$, has rank zero or rank one. Then (a) and (b) trivially follow.

Now assume that there is a $z \in Z^{N(N-1)/2}$ such that $P(z)$ has rank greater than one. Then there is a $z' \in Z^+$ and $i, j \in S$ such that

$P_{ii}(z') > 0$
$P_{jj}(z') > 0$

and, naturally, $P(z')$ has rank greater than one, and it is subrectangular (by SSR). By Perron's theorem, $P(z')$ has a left eigenvector $\tilde{\pi}$ corresponding to the eigenvalue of largest magnitude, and satisfying $\tilde{\pi}_j > 0$, $j \in J(z')$. Consider the set $\{T(\eta, (z')^k) : k \in <1,\infty>\}$. Clearly this set either contains exactly one element or else it consists of an infinite number of distinct elements. Using the word selected above, define $\hat{\eta}(z) = T(\tilde{\pi}, z)$. For any $z \in Z^{N(N-1)/2}$ such that $P(z)$ has rank one, define $\hat{\eta}(z) = T(e^i, z)$ for any $i \in I(z)$.

Now, if it is true that, for any $z_1, z_2 \in Z^{N(N-1)/2} \cap Z^+$,

$T(\hat{\eta}(z_1), z_2) = \hat{\eta}(z_2)$

then (a) and (b) follow trivially. On the other hand, if

$T(\hat{\eta}(z_1), z_2) \ne \hat{\eta}(z_2)$

for some $z_1, z_2 \in Z^{N(N-1)/2} \cap Z^+$, then an infinite number of distinct possible information vector values exist (by decomposing $z_2$ in the manner described above), and (a) and (b) are both false.  †

An  algorithm  based  on  the  proof  of  (16.1)  decides  whether  or  not 
a free  FPS  is  equivalent  to  one  that  is  state-calculable.  A similar 
algorithm  will  perform  the  same  test  for  an  arbitrary  FPS.  The  FPS  is 
first  decomposed  into  connected  detectable  components,  following  the 
analysis  in  Section  15.  The  possible  information  vector  values  are 
then  enumerated.  However,  whenever  an  information  vector  value  results 
from  a transition  having  subrectangular  probability  matrix  of  rank 
greater  than  one,  this  information  vector  must  coincide  with  the  Perron 
eigenvector  for  that  transition  probability  matrix.  Since  the 
enumerations  are  performed  on  extremely  large  sets,  this  decision 
algorithm  is  computationally  infeasible  in  all  but  the  trivial  cases. 

At  the  same  time,  it  should  be  clear  that  in  very  few  cases  will  the 
FPS  actually  be  equivalent  to  a state-calculable  system. 

A more practical approach to stochastic realization is to approximate
the FPS by a system whose state is the memory state induced by
a large  memory  set.  This  FPS  is  state-calculable  because  memory  states 
may  be  recursively  computed,  and  the  closeness  of  the  approximation  may 
be  established  by  detectability  arguments. 


CHAPTER  III 


The  finite-horizon  partially-observable  Markov  decision  problem 
was  solved  by  Sondik  [1971].  His  results  are  reviewed  here,  in  slightly 
modified  form. 

Sondik showed that every finite-horizon problem has an optimal finite-memory solution. This may be demonstrated in a number of ways. One of these is to argue that the information vector assumes values of the form $\{T(\pi(0), z) : z \in Z^{K*}(\pi(0)) \cap Z^+\}$. Since this is a finite set, the problem may be restated as a finite-horizon Markov decision problem with perfect state observation, where the memory state $z(0; k)$ is regarded as the state variable. The optimal policy will then determine the input on the basis of this memory state. A dual argument states that at any time k, the remaining strategy (given the present time and information state) can be expressed as a policy $\phi_{[k,\eta(k)]} : Z^{(K-k)*} \to U$. Since there are only a finite number of these, the optimal input may be computed by enumeration. A computational procedure which is based on the latter argument is now described.

Consider  a modification  of  the  finite-horizon  FPS  control  problem 
in  which  the  information  vector  is  regarded  as  a perfectly-observed 
state  variable.  The  expected  incremental  reward  at  time  k takes  the 
form: 


$E\{r(k) \mid \eta(k), u(k)\} = \eta(k)\,q(u(k))$
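The two operations used throughout this chapter, the expected incremental reward and the information vector transition, can be sketched as follows. The sketch assumes the usual normalized form $T(\pi,u,y) = \pi P(y|u) / (\pi P(y|u)\mathbf{1})$, which is consistent with the way $T$ enters the recursions below; the function names are illustrative only.

```python
def expected_reward(pi, q_u):
    """Expected incremental reward pi q(u) for information vector pi."""
    return sum(p * r for p, r in zip(pi, q_u))


def information_update(pi, P_yu):
    """Return (T(pi, u, y), probability of observing y) for matrix P(y|u).

    Assumes T is the normalized row vector pi P(y|u).
    """
    n = len(pi)
    row = [sum(pi[i] * P_yu[i][j] for i in range(n)) for j in range(n)]
    mass = sum(row)  # pi P(y|u) 1, the probability of output y
    if mass == 0:
        raise ValueError("output has zero probability under this information vector")
    return [x / mass for x in row], mass


new_pi, prob_y = information_update([0.5, 0.5], [[0.5, 0.0], [0.0, 0.25]])
```

Returning the output probability alongside the updated vector is convenient because the dynamic programming recursion weights each successor value by exactly that probability.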




The problem, consequently, is to maximize the performance index $E\{\sum_{k=0}^{K} b(k)\,\eta(k)\,q(u(k))\}$. Application of Bellman's Principle of Optimality yields

$v^{k-1,K}[\pi] = \max_{u \in U} \{b(k)\,\pi q(u) + \sum_{y \in Y} (\pi P(y|u) \mathbf{1})\, v^{k,K}[T(\pi,u,y)]\}$
$v^{K,K}[\pi] = 0$  (17.2)

where $v^{k,K}$ is a real-valued function on $\Pi_N$ representing the value of being in a particular information state at time k for a problem with horizon K. For ease of notation, extend the domain of $v^{k,K}$ to
by  defining 


Now define finite subsets of $R^N$:

$W^{k-1,K} = \{q(u) + \sum_{y \in Y} P(y|u)\,w_y : u \in U,\ w_y \in W^{k,K},\ y \in Y\}$
$W^{K,K} = \{0\}$  (17.5)

Eq. (17.4) may now be expressed as

$v^{k,K}[\pi] = \max\{\pi w : w \in W^{k,K}\}$  (17.6)


Thus, each function $v^{k,K}$ is convex and piecewise-linear with a finite number of faces. Each region of $\Pi_N$ throughout which $v^{k,K}$ is linear is a region where the strategy-to-go is constant; thus the elements of $W^{k,K}$ may be viewed as controller states. Specifically, if $v^{k,K}[\pi] = \pi w$ and $w = q(u) + \sum_{y \in Y} P(y|u)\,w_y$, then an optimal controller faced with information vector $\pi$ at time k selects input u and is assured that $v^{k+1,K}[\eta(k+1)] = \eta(k+1)\,w_{y(k+1)}$.
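The backward construction of the sets $W^{k,K}$ in (17.5) and the evaluation in (17.6) can be sketched by brute-force enumeration. This is a minimal illustration, not Sondik's implementation: it takes $b(k) = 1$, stores `q` and `P` as dictionaries keyed by input (and output), and performs no pruning, so the set grows very quickly.

```python
from itertools import product


def backup(W_next, q, P):
    """One application of (17.5): build W^{k-1,K} from W^{k,K}.

    W_next : iterable of tuples (the vectors w in W^{k,K})
    q[u]   : reward vector for input u
    P[u][y]: N x N matrix P(y|u)
    """
    W = set()
    for u in q:
        outputs = sorted(P[u])
        # Choose one successor vector w_y for every output y.
        for choice in product(W_next, repeat=len(outputs)):
            w = list(q[u])
            for y, w_y in zip(outputs, choice):
                M = P[u][y]
                for i in range(len(w)):
                    w[i] += sum(M[i][j] * w_y[j] for j in range(len(w_y)))
            W.add(tuple(w))
    return W


def value(pi, W):
    """Evaluate (17.6): the maximum of pi w over w in W."""
    return max(sum(p * x for p, x in zip(pi, w)) for w in W)


# Two states, two inputs, a single uninformative output, identity dynamics.
I2 = [[1.0, 0.0], [0.0, 1.0]]
q = {0: (1.0, 0.0), 1: (0.0, 1.0)}
P = {0: {0: I2}, 1: {0: I2}}
W1 = backup({(0.0, 0.0)}, q, P)   # one decision remaining
W2 = backup(W1, q, P)             # two decisions remaining
```

Each vector in the returned set corresponds to a first input together with a choice of continuation for every output, which is why the elements can be read as controller states.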


The size of each set $W^{k,K}$ may be reduced by eliminating elements that correspond to memory states which can never be reached. Specifically, if $w \in W^{k,K}$ is such that $\min_{\pi \in \Pi_N} \{v^{k,K}[\pi] - \pi w\} > 0$, then w can be eliminated from $W^{k,K}$ without loss of generality. This test is effected through the solution of a simple linear program.

Of course, this solution procedure is not necessarily applicable to infinite-horizon problems, because the size of $W^{0,K}$ can increase without bound as $K \to \infty$. Drake [1962, 1968] and Sondik [1971] have noted that, in certain problems, $W^{0,K}$ converges (except for a constant gain) in a finite number of iterations; a finite-memory realization of the infinite-horizon optimal controller is thus obtained. Although it is true, in the infinite-horizon problem, that existence of a finite-memory realization of the optimal controller implies that the value function is piecewise-linear with a finite number of faces, this does not in turn imply that the number of faces in the approximations $v^{0,K}$ is bounded. Thus $\#W^{0,K}$ may diverge as $K \to \infty$, although, in the limit, a piecewise-linear relative value function with a finite number of faces is approached. Furthermore, many of the faces in $W^{0,K}$ may correspond to transient memory states. In the Machine Maintenance and Repair Problem, the optimal value function is characterized by well over thirty faces, only eight of which are required to realize an optimal controller.




18.  State-Observable  Problems 


A state-observable FPS is, of course, equivalent to a Markov decision process. This section reviews known methods for its solution; additional references are given in Section 4. Since $y(k)$ uniquely determines $s(k)$, $P_{ij}(u)$ will denote $\sum_{y \in Y} P_{ij}(y|u)$.

The finite-horizon problem is solved by computing value functions:

$v^{k-1,K}(i) = \max_{u \in U} \{b(k)\,q_i(u) + \sum_{j \in S} P_{ij}(u)\,v^{k,K}(j)\}$
$v^{K,K}(i) = 0$  (18.1)


The  optimal  decision  at  time  k-1,  for  a system  in  state  i,  is  the  input 
u which  maximizes  (18.1).  Thus  the  optimal  strategy  selects  inputs  on 
the  basis  of  current  state  and  time  alone. 

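If $b(k) = \beta^k$, then $v^{k,K} = \beta^{k+1} v^{K-k}$ where

$v^m(i) = \max_{u \in U} \{q_i(u) + \beta \sum_{j \in S} P_{ij}(u)\,v^{m-1}(j)\}$
$v^0(i) = 0$  (18.2)

As $m \to \infty$, $v^m$ approaches a limit $v^*$ satisfying:

$v^*(i) = \max_{u \in U} \{q_i(u) + \beta \sum_{j \in S} P_{ij}(u)\,v^*(j)\}$  (18.3)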
Thus the optimal strategy in the infinite-horizon discounted problem determines inputs on the basis of the present state alone.

Eq. (18.3) can be solved by computing the sequence $\{v^m\}$ according to (18.2). This computational procedure is called value iteration.
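A minimal sketch of value iteration for the discounted problem follows. The data layout (`q[i][u]` for rewards, `P[u][i][j]` for transition probabilities) and the stopping tolerance are assumptions made for this illustration.

```python
def value_iteration(q, P, beta, sweeps=10000, tol=1e-12):
    """Iterate (18.2) until successive iterates differ by less than tol.

    q[i][u]    : incremental reward in state i under input u
    P[u][i][j] : transition probability from i to j under input u
    beta       : discount factor, 0 <= beta < 1
    Returns the value vector and a policy attaining the maximum.
    """
    n, n_inputs = len(q), len(P)

    def score(i, u, v):
        return q[i][u] + beta * sum(P[u][i][j] * v[j] for j in range(n))

    v = [0.0] * n  # v^0 = 0
    for _ in range(sweeps):
        new_v = [max(score(i, u, v) for u in range(n_inputs)) for i in range(n)]
        change = max(abs(a - b) for a, b in zip(new_v, v))
        v = new_v
        if change < tol:
            break
    policy = [max(range(n_inputs), key=lambda u: score(i, u, v)) for i in range(n)]
    return v, policy


# Input 0 keeps the state, input 1 switches it; only (state 0, input 0) pays.
P = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
q = [[1.0, 0.0], [0.0, 0.0]]
v_star, policy = value_iteration(q, P, beta=0.5)
```

In this small example the policy depends on the present state alone, as the text states: stay in state 0, and switch out of state 1.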

If $\beta$ is large (i.e. near unity), then computational instability may occur. This difficulty is avoided by defining:

$g^* = (1-\beta)\,v^*(N)$
$\tilde{v}^*(i) = v^*(i) - v^*(N)$  (18.4)

Eq. (18.3) now becomes

$\tilde{v}^*(i) = \max_{u \in U} \{q_i(u) + \beta \sum_{j \in S} P_{ij}(u)\,\tilde{v}^*(j)\} - g^*$
$\tilde{v}^*(N) = 0$  (18.5)

The function $\tilde{v}^*$ is called a relative value function, and $g^*$ is called the average gain. This follows from the decomposition:

$v^*(i) = \tilde{v}^*(i) + v^*(N) = \tilde{v}^*(i) + \sum_{k=0}^{\infty} \beta^k g^*$

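Eq. (18.5) might be solved by White's algorithm

$\hat{v}^{m-1}(i) = \max_{u \in U} \{q_i(u) + \beta \sum_{j \in S} P_{ij}(u)\,\tilde{v}^{m-1}(j)\}$
$\tilde{v}^m(i) = \hat{v}^{m-1}(i) - \hat{v}^{m-1}(N)$  (18.6)
$\tilde{v}^0(i) = 0$

On the basis of $\hat{v}^m$ and $\tilde{v}^m$, MacQueen bounds on $g^*$ may be computed:

$\min_{i \in S} [\hat{v}^m(i) - \tilde{v}^m(i)] \le g^* \le \max_{i \in S} [\hat{v}^m(i) - \tilde{v}^m(i)]$  (18.7)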
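A relative value iteration of this kind, together with the min/max bracket on the gain, can be sketched as below. The sketch uses the last state as the reference state and defaults to the undiscounted case; the names and data layout are illustrative assumptions rather than the report's own procedure.

```python
def relative_value_iteration(q, P, beta=1.0, sweeps=100):
    """Relative value iteration with a min/max bracket on the gain.

    q[i][u], P[u][i][j] as in ordinary value iteration; the last state
    plays the role of the reference state N.
    Returns (relative values, lower bound, upper bound).
    """
    n, n_inputs = len(q), len(P)
    ref = n - 1
    v = [0.0] * n
    lower = upper = None
    for _ in range(sweeps):
        # One maximization sweep applied to the current relative values.
        v_hat = [
            max(q[i][u] + beta * sum(P[u][i][j] * v[j] for j in range(n))
                for u in range(n_inputs))
            for i in range(n)
        ]
        diffs = [a - b for a, b in zip(v_hat, v)]
        lower, upper = min(diffs), max(diffs)
        # Subtract the reference component so the iterates stay bounded.
        v = [a - v_hat[ref] for a in v_hat]
    return v, lower, upper


# Single input, both states equally likely next; reward 1 in state 0 only.
P = [[[0.5, 0.5], [0.5, 0.5]]]
q = [[1.0], [0.0]]
rel_v, g_low, g_high = relative_value_iteration(q, P)
```

When the lower and upper bounds coincide, as they do here after two sweeps, the gain has been identified exactly.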

Eq. (18.5) may also be regarded as a linear program:

min: g
subject to: $\tilde{v}^*(i) \ge q_i(u) + \beta \sum_{j \in S} P_{ij}(u)\,\tilde{v}^*(j) - g$, $i \in S$, $u \in U$
$\tilde{v}^*(N) = 0$  (18.8)

As it turns out, an optimal basic solution will satisfy (18.8) with strict equality for exactly one input corresponding to each state. Thus, an optimal policy is obtained.

Now consider the infinite-horizon undiscounted problem. When $\beta = 1$, $\{\tilde{v}^m\}$ is not guaranteed to remain bounded; and even if it remains bounded, it is not guaranteed to converge. Boundedness occurs if the average gain does not depend on the initial state, and convergence occurs if the optimal system is aperiodic.

Assume first that $\{\tilde{v}^m\}$ is bounded. Then difficulties relating to convergence are avoided by defining the problem as a limit of discounted problems as $\beta \to 1$. Thus a solution to the linear program

min: g
subject to: $\tilde{v}^*(i) \ge q_i(u) + \sum_{j \in S} P_{ij}(u)\,\tilde{v}^*(j) - g$, $i \in S$, $u \in U$
$\tilde{v}^*(N) = 0$  (18.9)

is sought. Computationally, convergence is assured by Schweitzer's (damped value-iteration) algorithm

$\hat{v}^{m-1}(i) = \max_{u \in U} \{q_i(u) + \sum_{j \in S} P_{ij}(u)\,\tilde{v}^{m-1}(j)\}$
$\tilde{v}^m(i) = \beta[\hat{v}^{m-1}(i) - \hat{v}^{m-1}(N)] + (1-\beta)\,\tilde{v}^{m-1}(i)$  (18.10)
$\tilde{v}^0(i) = 0$, $0 < \beta < 1$

Odoni bounds on g may be computed

$\min_{i \in S} [\hat{v}^m(i) - \tilde{v}^m(i)] \le g \le \max_{i \in S} [\hat{v}^m(i) - \tilde{v}^m(i)]$  (18.11)
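The effect of damping is easiest to see on a periodic chain, where the undamped relative iteration oscillates. The following sketch of a damped iteration is an illustration under assumed names and data layout, with the damping constant playing the role of $\beta$ in (18.10).

```python
def damped_value_iteration(q, P, damping=0.5, sweeps=200):
    """Damped relative value iteration for the undiscounted problem.

    q[i][u], P[u][i][j] as before; the last state is the reference state.
    Returns (relative values, lower bound on gain, upper bound on gain).
    """
    n, n_inputs = len(q), len(P)
    ref = n - 1
    v = [0.0] * n
    lower = upper = None
    for _ in range(sweeps):
        v_hat = [
            max(q[i][u] + sum(P[u][i][j] * v[j] for j in range(n))
                for u in range(n_inputs))
            for i in range(n)
        ]
        diffs = [a - b for a, b in zip(v_hat, v)]
        lower, upper = min(diffs), max(diffs)
        # Convex combination of the new relative values and the old ones.
        v = [damping * (v_hat[i] - v_hat[ref]) + (1.0 - damping) * v[i]
             for i in range(n)]
    return v, lower, upper


# Deterministic two-cycle: reward 2 in state 0, reward 0 in state 1, gain 1.
P = [[[0.0, 1.0], [1.0, 0.0]]]
q = [[2.0], [0.0]]
rel_v, g_low, g_high = damped_value_iteration(q, P)
```

Averaging each new iterate with the previous one removes the period-two oscillation, so the bounds close on the gain of 1.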

Simple connectivity is a sufficient condition for $\{\tilde{v}^m\}$ to be bounded. A general Markov decision problem may be solved by decomposing it into simply connected subproblems, as described below:

(18.12) Algorithm (Solution of a Markov decision problem)

Step 1. Let $\tilde{S}$ denote the "remaining region of S" and set $\tilde{S} = S$. Let $\tilde{U}(i)$ denote the admissible input set when the system is known to be in state i, and set $\tilde{U}(i) = U$, $i \in S$. Also set $\hat{g}(i) = Q_{\min}$, $i \in S$.

Step 2. Determine a connected class C in $(\tilde{S}, \tilde{U})$, the Markov decision process with state set restricted to $\tilde{S}$ and input set restricted to $\tilde{U}(i)$ when the system is in state i. Since $\tilde{S}$ is nonempty, such a connected class exists.

Step 3. Solve the Markov decision problem within $(C, \tilde{U})$ to obtain a gain g. Set $\hat{g}(i) = g$, $\forall i \in C$.

Step 4. Set $\tilde{S} = \tilde{S} - C$. For every triplet $i \in \tilde{S}$, $u \in \tilde{U}(i)$, $j \in S - \tilde{S}$ that satisfies $P_{ij}(u) > 0$, set $\tilde{U}(i) = \tilde{U}(i) - \{u\}$. If $\tilde{U}(i) = \emptyset$, then set $\tilde{S} = \tilde{S} - \{i\}$. Repeat this elimination process until $\tilde{S}$, $\{\tilde{U}(i) : i \in \tilde{S}\}$ have been minimized. If $\tilde{S}$ is nonempty, then return to Step 2.

Step 5. Solve the system of equations:

$g(i) = \max(\hat{g}(i),\ \max_{u \in U} [\sum_{j \in S} P_{ij}(u)\,g(j)])$.

This may be done by value iteration:

$g^m(i) = \max(\hat{g}(i),\ \max_{u \in U} [\sum_{j \in S} P_{ij}(u)\,g^{m-1}(j)])$
$g^0(i) = \hat{g}(i)$

or by solving the linear program:

min: $\varepsilon$
subject to: $g(i) \ge \sum_{j \in S} P_{ij}(u)\,g(j) - \varepsilon$
$g(i) \ge \hat{g}(i)$

Step 6. Set $\tilde{U}(i) = \{u : g(i) = \sum_{j \in S} P_{ij}(u)\,g(j)\}$, and $\tilde{q}_i(u) = q_i(u) - g(i)$.

Now solve the Markov decision problem with incremental rewards $\tilde{q}_i(u)$ and admissible input set $\tilde{U}(i)$ while the system is in state i. Since the average gain has been subtracted from the incremental rewards, the transformed system has gain zero, and within any class of states for which $g(i)$ is the same, the correct relative values will be obtained.

Remark: The policy determined in Step 5 is gain-optimal. Step 6 is necessary only if bias optimality is desired as well.
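The value iteration of Step 5 can be sketched as follows. The sketch assumes the class gains from Steps 2 to 4 are already available as a list `g_hat`, maximizes over all inputs as the equation in Step 5 is written, and uses hypothetical names throughout.

```python
def propagate_gains(g_hat, P, sweeps=1000, tol=1e-12):
    """Value iteration for g(i) = max(g_hat(i), max_u sum_j P_ij(u) g(j)).

    g_hat      : gains assigned to each state by the class-by-class solution
    P[u][i][j] : transition probability from i to j under input u
    """
    n, n_inputs = len(g_hat), len(P)
    g = list(g_hat)  # g^0 = g_hat
    for _ in range(sweeps):
        new_g = [
            max([g_hat[i]] + [sum(P[u][i][j] * g[j] for j in range(n))
                              for u in range(n_inputs)])
            for i in range(n)
        ]
        change = max(abs(a - b) for a, b in zip(new_g, g))
        g = new_g
        if change < tol:
            break
    return g


# States 0 and 2 are absorbing with gains 1 and 0; state 1 can move to either.
P = [
    [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],  # input 0: 1 -> 0
    [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]],  # input 1: 1 -> 2
]
gains = propagate_gains([1.0, -1.0, 0.0], P)
```

The transient state inherits the gain of the best class it can steer into, which is the sense in which the resulting policy is gain-optimal.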

19.  Existence  of  a Solution  in  General  Infinite-horizon  Problems

This section is concerned with well-posedness of optimization problems formulated in Chapter I. Its purpose is to establish conditions under which an optimal strategy exists. In the present analysis, the optimal strategy need not satisfy a finite-memory constraint.

A sufficient condition* for existence of an optimal strategy is that there exist a solution to the infinite dimensional linear program:

$v^*(\pi) = \max_{u \in U} \{\pi q(u) + \beta \sum_{y \in Y} (\pi P(y|u)\mathbf{1})\, v^*(T(\pi,u,y))\} - g^*$  (19.1)

If the relative value function $v^*$ exists, then there is a function $\phi^*$ which describes the input maximizing (19.1) as a function of $\pi$. If $\phi^*$ is used to select inputs on the basis of the information vector, then the optimal gain $g^*$ will be achieved. $\phi^*$ will be called an optimal feasible policy.

(19.2) Definition. An infinite-horizon FPS control problem is called regular if it is either discounted or both simply connected and detectable.

*The  straightforward  proof  parallels  well-known  arguments  for  the  state- 
observable  case;  see  Kushner  [1971]. 
(19.3) Theorem. Suppose either (a) that $\beta < 1$ or (b) that the FPS satisfies conditions of connectivity and (weak) detectability. If connectivity holds, then let $\ell_\rho$ and $\rho$ be as in (11.5); otherwise define $\ell_\rho = \rho = 1$. If weak detectability holds, then let $\bar{\ell}$ and $\bar{\alpha}$ be as in (14.22); otherwise define $\bar{\ell} = \bar{\alpha} = 1$. Finally, define

$L(\beta, \ell) = \sum_{k=0}^{\ell-1} \beta^k$, which equals $(1-\beta^\ell)/(1-\beta)$ if $\beta < 1$ and $\ell$ if $\beta = 1$  (19.4)

and

$\Omega = \dfrac{L(\beta, \ell_\rho + \bar{\ell})\,Q}{1 - \beta^{(\ell_\rho + \bar{\ell})}[1 - (1-\rho)(1-\bar{\alpha})]}$  (19.5)

Then there exists a solution $v^*$ to (19.1) having the following properties:

(i) $v^*$ is continuous throughout $\Pi_N$

(ii) $v^*$ is convex

(iii) $\|v^*\|_D \le \Omega$

Remark. If $\beta = 1$, then $\Omega = \dfrac{(\ell_\rho + \bar{\ell})\,Q}{(1-\rho)(1-\bar{\alpha})}$.

Heuristic Justification: Only the undiscounted ($\beta = 1$), strongly connected ($\ell_\rho = 1$), weakly subrectangular ($\bar{\ell} = 1$) case is considered here.

A solution $v^* = \lim_{m \to \infty} v^m$ is constructed by damped value iteration (18.10) where, following (17.3),

$v^{m+1}(\pi) = \frac{1}{2} v^m(\pi) + \frac{1}{2} \max_{u \in U} \{\pi q(u) + \sum_{y \in Y} v^m(\pi P(y|u))\}$
$v^0(\pi) = 0$  (19.6)

Then

$v^m = \sum_{k=0}^{m} (1/2)^m \binom{m}{k}\, v_0^k$  (19.7)

where $v_0^k$ is the finite-horizon value function when k decisions remain. Each $v_0^k$ is convex by arguments given in Section 17.

It is now demonstrated (by induction) that $\|v_0^m\|_D \le \Omega$. Since $v_0^m$ is convex, it achieves a maximum at some vertex of $\Pi_N$. Let j be the state that maximizes $v_0^m(e^i)$ and let $u^*$ be the input that maximizes $\{e^j q(u^*) + \sum_{y \in Y} v_0^{m-1}(e^j P(y|u^*))\}$, i.e. j is the most desirable initial state for an m-transition problem and $u^*$ is the first optimal input for such a problem when the initial state is known to be j. Then,

$v_0^m(\pi) \ge \pi q(u^*) + \sum_{y \in Y} v_0^{m-1}(\pi P(y|u^*))$

Now:

$v_0^m(e^j) - v_0^m(\pi)$

$\le \{e^j q(u^*) + \sum_{y \in Y} v_0^{m-1}(e^j P(y|u^*))\} - \{\pi q(u^*) + \sum_{y \in Y} v_0^{m-1}(\pi P(y|u^*))\}$

$\le Q_{\max} - Q_{\min} + \sum_{y \in Y} [(e^j P(y|u^*)\mathbf{1})\, v_0^{m-1}(T(e^j,u^*,y)) - (\pi P(y|u^*)\mathbf{1})\, v_0^{m-1}(T(\pi,u^*,y))]$

$= Q + \pi_j \sum_{y \in Y} (e^j P(y|u^*)\mathbf{1})\,[v_0^{m-1}(T(e^j,u^*,y)) - v_0^{m-1}(T(\pi,u^*,y))]$
$\quad + (1-\pi_j) \sum_{y \in Y} [(e^j P(y|u^*)\mathbf{1})\, v_0^{m-1}(T(e^j,u^*,y)) - \frac{(\pi - \pi_j e^j)}{(1-\pi_j)} P(y|u^*)\mathbf{1}\; v_0^{m-1}(T(\pi,u^*,y))]$

$\le Q + \pi_j \sum_{y \in Y} (e^j P(y|u^*)\mathbf{1})\,\alpha[(u^*,y)]\, \|v_0^{m-1}\|_D + (1-\pi_j)\, \|v_0^{m-1}\|_D$

$\le Q + [1 - \pi_j(1-\bar{\alpha})]\, \|v_0^{m-1}\|_D$  (19.8)


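For any $\pi \in \Pi_N$,

$v_0^{m+1}(\pi) = \max_{u \in U} \{\pi q(u) + \sum_{y \in Y} v_0^m(\pi P(y|u))\} \le Q_{\max} + v_0^m(e^j)$  (19.9)

and, letting $\hat{u}$ be the input for which $\sum_{y \in Y} (\pi P(y|\hat{u}))_j \ge 1-\rho$,

$v_0^{m+1}(\pi) \ge \pi q(\hat{u}) + \sum_{y \in Y} v_0^m(\pi P(y|\hat{u}))$

$\ge Q_{\min} + v_0^m(\sum_{y \in Y} \pi P(y|\hat{u}))$  (19.10)

$\ge Q_{\min} + v_0^m(e^j) - Q - [1 - (1-\rho)(1-\bar{\alpha})]\, \|v_0^{m-1}\|_D$  (19.11)

$\|v^m\|_D \le \|v_0^m\|_D \le \dfrac{Q}{(1-\rho)(1-\bar{\alpha})}$, $m \in <0,\infty>$  (19.12)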

The damped value-iteration, (19.6), assures that, if $\{v^m\}$ has any (pointwise) limit, then it converges uniformly to that limit; the sequence $\{v^m\}$ has a limit because it is convex and bounded; thus $v^*$ exists and is a solution to (19.1). $v^*$ is convex and bounded, by convexity and boundedness of $\{v^m\}$.

Continuity of $v^*$ is most readily established in strongly subrectangular systems. Here,

$\{\pi q(u) + \sum_{y \in Y} (\pi P(y|u)\mathbf{1})\, v^*(T(\pi,u,y))\}$

is continuous in $\pi$ for each $u \in U$, because $\{T(\pi,u,y) : \pi \in \Pi_N\}$ lies in the interior of a face of $\Pi_N$ (see Figure 5-1) and a convex function is always continuous over a relatively open subset of its domain. Thus the right-hand side of (19.1) is continuous, and $v^*$ is continuous.
Proof: The complete proof of (19.3) is given in Appendix A.

(19.13) Corollary. Let $e^{j*}$ be the information vector of maximal value, in a connected, detectable, FPS control problem. Let $\ell_\rho, \rho$ be such that, for any $\pi \in \Pi_N$, there exists an input word $u \in U^{\ell_\rho}$ satisfying:

$1 - \sum_{y \in Y^{\ell_\rho}} \sum_{i \in S} \pi_i P_{ij*}(y|u) \le \rho$

Then $\|v^*\|_D \le \Omega$, where $\Omega$ is given by (19.5).


Interpretation: $\|v^*\|_D$ may be bounded on the basis of reachability of the most valuable state alone. In a network of queues, the most valuable state is readily identified without solving the problem (it is the state in which all queues are empty). In this manner, a tighter bound on $\|v^*\|_D$ is obtained.


(19.14) Theorem. Consider a regular FPS control problem. If the system is simply connected, then let $\ell_C, \alpha_C$ be numbers such that the internal state enters the connected class C with probability $1-\alpha_C$ or more after $\ell_C$ transitions and let $\Omega$ be as in (19.3) for the system restricted to C; otherwise define $\ell_C, \alpha_C = 0$, and let $\Omega$ be as in (19.3) for the system as specified. Then there exists a continuous, convex, bounded, relative value function $v^*$ satisfying (19.1), such that

$\|v^*\|_D \le \Omega + \dfrac{\ell_C Q}{1-\alpha_C}$

Proof: It is necessary only to demonstrate boundedness of values $\{v^m\}$ in the proof of (19.3). Now

$\max_{i \in S}\{v^m(e^i)\} \le \ell_C Q + \alpha_C \max_{i \in S}\{v^m(e^i)\} + (1-\alpha_C) \max_{i \in C}\{v^m(e^i)\}$

and so:

$\max_{i \in S}\{v^m(e^i)\} \le \max_{i \in C}\{v^m(e^i)\} + \dfrac{\ell_C Q}{1-\alpha_C}$

Consequently, arguments given in the proof of (19.3) show that $v^*$ satisfies the desired conditions.
20.  An  Alternate  Formulation  for  Irregular  Problems 


Consider  the  following  problem,  to  which  no  optimal  solution 
exists. 


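(20.1) Example. $U = \{1,2\}$, $Y = \{1\}$, $N = 3$, $\pi(0) = (0,0,1)$ and

$P(1|1) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & .5 & .5 \end{pmatrix}$, $P(1|2) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ .5 & 0 & .5 \end{pmatrix}$

The incremental reward vectors are:

$q(1) = (1, 0, 0)^T$, $q(2) = (0, 0, 0)^T$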

The performance index is infinite-horizon undiscounted average reward. A suboptimal solution may be obtained by the following argument: if any reward at all is to be achieved, then the system must be made to enter state 1, through initial application of input 2. Once state 1 has been reached, input 1 should be applied at all times. Unfortunately, there is no way for the controller to learn whether state 1 has been entered. If input 2 is applied n times and input 1 is applied thereafter, the performance $1-(.5)^n$ is achieved; this may be made arbitrarily close to 1. The supremum feasible performance g can never be attained: if input 2 is applied at all times, then the gain will be zero; and if input 1 is applied once, at time k, then the system enters state 2 with probability $(.5)^{k+1}$ and the performance cannot exceed $1-(.5)^{k+1}$.
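The claim that the strategy "apply input 2 for n steps, then input 1 forever" earns an average reward of $1-(.5)^n$ can be checked numerically. The sketch below propagates the state distribution over a long but finite horizon, so it approximates the infinite-horizon average; the matrices and reward vectors are those of Example (20.1) as read here, and the function names are illustrative.

```python
P1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.5, 0.5]]  # P(1|1)
P2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.0, 0.5]]  # P(1|2)
Q1 = [1.0, 0.0, 0.0]  # q(1): reward only in state 1 under input 1
Q2 = [0.0, 0.0, 0.0]  # q(2): no reward under input 2


def step(pi, M):
    """Advance the state distribution one transition."""
    n = len(pi)
    return [sum(pi[i] * M[i][j] for i in range(n)) for j in range(n)]


def average_reward(n_switch, horizon=10000):
    """Average reward when input 2 is used for n_switch steps, then input 1."""
    pi = [0.0, 0.0, 1.0]  # pi(0): start in state 3
    total = 0.0
    for k in range(horizon):
        M, q = (P2, Q2) if k < n_switch else (P1, Q1)
        total += sum(p * r for p, r in zip(pi, q))
        pi = step(pi, M)
    return total / horizon


performance_n3 = average_reward(3)
```

Increasing `n_switch` pushes the average toward 1 but never reaches it over an infinite horizon, which is the infinitely-delayed behaviour the example is meant to exhibit.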
A well-known  class  of  problems,  to  which  no  solution  exists,  is 
the  finite-memory  hypothesis  testing  problem  with  choice  of  experiments, 
also  known  as  the  N-armed  bandit  problem.  In  the  two-armed  bandit 
problem,  a gambler  is  confronted  with  two  slot  machines.  For  each  coin 
invested,  one  machine  returns  two  coins  with  probability  .6,  none  with 
probability  .4;  and  the  other  machine  returns  two  coins  with  probability 
.4,  none  with  probability  .6.  It  is  not  known  initially  which  machine 
is  the  more  favorable. 

Failure  of  an  optimal  strategy  to  exist  is  a consequence  of  the 
infinitely-delayed  splurge  phenomenon  discussed  in  section  5a.  This, 
in  turn,  results  from  null-transitivity  of  certain  information  states 
in  a system  that  is  not  detectable.  Specifically,  infinitely-delayed 
splurges  may  occur  when: 

(i) Under $\varepsilon$-optimal strategies, for $\varepsilon$ sufficiently small, $\mu(k) > 1$; i.e. there are recursively-computable functions of the state that may be interpreted as one-time hypotheses;

(ii)  In  the  limit,  where  an  infinite  past  is  available,  the 
correct  hypothesis  may  be  identified  exactly,  and  a 
detectable  problem  results; 

(iii)  The  cost  of  identifying  an  hypothesis  is  infinite. 




Such  problems  may  be  solved  in  two  steps,  described  below. 

Step  1 (steady-state) 

Under  the  assumption  that  the  state  was  exactly  known 
at  some  point  In  the  infinitely  distant  past,  the  problem 
becomes  detectable,  and  an  optimal  strategy  exists.  This 
strategy  might  not  satisfy  a finite-memory  constraint,  but 
its  performance  may  be  approximated,  arbitrarily  closely, 
by  a finite-memory  controller  in  the  following  sense:  for 
any  C>0,  there  is  a finite-memory  controller  whose  average 
reward,  over  a given  time  interval  of  length  K,  lies  between 
g*-e  and  g*  with  probability  approaching  unity  as  K>°°. 

Step  2 (initial  identification) 

The  correct  hypothesis  may  be  arbitrarily  closely 
identified  in  a finite  number  of  transitions.  Let  the 
terminal  reward  be  1 if  the  hypothesis  is  correctly  identi- 
fied, and  0 if  it  is  not.  Then  solve  the  finite-horizon 
problem  by  the  methods  cited  in  Section  4,  or  by  the 
algorithm of Sondik. (The initialization procedure will
be described in greater detail in Section 21f.)
This report is concerned with hypothesis-testing only to the
extent that it occurs in problems of statistical decision and control.
As long as a problem is detectable, its "dual control" aspects involve
a reasonable tradeoff between information and control; otherwise the
problem must be solved in two separate steps. If available memory is
limited, then it must be decided how much memory is to be allocated to
identification, and how much is to be allocated to steady-state
performance. Note that memory allocation in this sense is indirectly
determined by the discount β (when β<1), since it specifies the manner
in which steady-state performance and identification costs are to be
compared.
CHAPTER IV


COMPUTATION OF ε-OPTIMAL CONTROLLERS
21.  Perceptive  Dynamic  Programming 

a.  The  Basic  Algorithm 

It  has  been  demonstrated,  in  Section  19,  that  there  exist 
solutions  to  regular  FPS  control  problems.  Yet,  it  may  be  impossible 
to compute or to implement solutions that fail to satisfy a
finite-memory constraint. This section introduces a feasible computational
technique for the solution of such problems.

In the computational technique of perceptive dynamic programming,
an increasing sequence of memory sets, {M^n}, is used to construct
approximations to the original problem. Each approximation is
parameterized by a memory set; the n-th approximation depends on memory
set M^n, but the iteration number n alone may be used to facilitate
notation. The approximation corresponding to memory set M is the
Markov decision problem that results when the augmented system induced
by M is assumed to be state-observable. The solution to this problem

is called a perceptive solution; it consists of a perceptive value
function v^M : X[M] → R and a perceptive gain g[M], obtained by solving
the system of equations:

\[
v^{M}[i,z] \;=\; \max_{u \in U} \Big\{ q_{z}(i,u) \;+\; \beta \sum_{j \in S} \sum_{y \in Y} P_{z}\big(i,j,(u,y)\big)\, v^{M}\big[j,\,T^{M}(z,(u,y))\big] \Big\} \;-\; g[M],
\qquad [i,z] \in X[M]
\]

(21.1)
In (21.1), perception of delayed states is assumed only when the
memory state is essential. Optimal decisions and relative values for
the remaining memory states can be determined by solving:


vM[tt(0),  z]  = max  ,{T(tt(0),  z)q(u) 


ueU 


+ ezycY(T(7T(o), 
/ 

z.)P(y 

u)l) 

I [EieSEjeSEkeS 

tt1(0) 

• P1^(z(u,y) 

-TM(z^,  (u,  y))) 

) • ( 

u,y))) 

• vm[j,TMU, 

(u,y))]] 

I / [ir(0)P(z)P(y  |u)2  1 , 

if  T M(z, 

(u,y))£ess[M) 

( v M [it  (0) , ^(u 

,y)  1 . 

otherwise 

-g [M] , zeZ  (tt(0))D  ess[M] 


(21.2) 


The policy maximizing (21.1) and (21.2) is denoted ψ^M.

A feasible strategy φ^M is devised by constructing a policy
adapted to M which realizes it. Select any mapping s : ess[M] → S
satisfying:

\[ s[z] \in I(z), \qquad \forall z \in \mathrm{ess}[M] \]   (21.3)

The substitution of a state guess for a perceived state will be called
pseudo-perception. Define the feasible policy to be

\[
\varphi^{M}[z] \;=\;
\begin{cases}
\psi^{M}[z], & \text{if } z \in M - \mathrm{ess}[M] \\
\psi^{M}\big[s[z],\,z\big], & \text{if } z \in \mathrm{ess}[M]
\end{cases}
\]   (21.4)

h[M] will denote the performance achieved by φ^M. Clearly:

\[ h[M] \;\le\; g^{*} \;\le\; g[M] \]   (21.5)
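
The construction of the feasible policy by pseudo-perception can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the program of Appendix C: policies are stored as dictionaries, the guess is taken to be the most likely delayed state, and all names (`most_likely_state`, `feasible_policy`, `psi_ess`, `psi_other`) are hypothetical.

```python
def most_likely_state(posterior):
    """Index of the largest entry of a probability vector (one possible guess rule)."""
    return max(range(len(posterior)), key=lambda i: posterior[i])


def feasible_policy(psi_ess, psi_other, guess):
    """Build a policy that depends on the memory state only.

    psi_ess   -- dict mapping (delayed_state, z) to an input, for essential z
    psi_other -- dict mapping z to an input, for non-essential z
    guess     -- dict mapping each essential z to the guessed delayed state s[z]
    """
    phi = dict(psi_other)               # non-essential states: policy is copied as is
    for z, s in guess.items():
        phi[z] = psi_ess[(s, z)]        # essential states: substitute the guess s[z]
    return phi
```

For example, with one essential memory state `"z1"` whose posterior is `[0.2, 0.8]`, the guess is state 1 and the feasible policy uses the perceptive input for the pair `(1, "z1")`.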

For a given sequence of memory sets {M^n}, these bounds may be denoted
h^n and g^n, respectively.

A key result is the following theorem, which states that
g^n - h^n → 0 as n → ∞.


(21.6) Theorem. Suppose either (a) that β<1 or (b) that the FPS
satisfies conditions of connectivity and (weak) detectability, and
let  Hp,p,2.,a,  L(f3,£)  and  Q be  as  in  (19 . 3)  — ( 19 . 5) . Also  let  a 
be  as  in  (14.22)  if  WDT  is  satisfied;  otherwise  define  a=l.  Then: 


[M]U  1 , [M] 

(a)  g[M]  - g*  < a mln  B 4fi 


_i  [M]  . [M] 

(b)  g[M]  - n[M]  <amn  B 


. | 


i [M]-fc  . [M] 


L(B,«.  [M]-a  . £M] ) + B 

max  mln 


minL  ( 0 


TrS 

l-B  a 


[Q+BSl] 


Heuristic Justification: The proof follows an argument given in
Section 5e.

Proof: The complete proof is given in Appendix B.

The generalization to systems having transient states is
straightforward.

(21.7)  Corollary.  For  any  regular  FPS  control  problem: 

* . [M]il  4 [M] 

g[M]  - h[M]  < a raln  g min 


where  a and  SL  are  as  in  (19.14). 
b.  Discussion 

The upper bounds {g^n} are clearly nonincreasing. The lower
bounds {h^n} might decrease if an unfortunate choice of s(·) is made.
If h^n < h^n', n > n', then φ^n' may be substituted for φ^n, since it is
adapted to M^n. Hence, the bounds {h^n} and {g^n} can be made monotone.

If the family of memory sets {M^n} = {Z^n | n ∈ Z+} is used, then the
bounds will converge geometrically as well. Computational experience
indicates that convergence will occur more rapidly than predicted by
(21.6), but it may not be rapid enough to assure feasibility, because
the computational effort (computer time or memory) required to solve
the perceptive problem increases as n → ∞. Since computational effort
is linearly related to the number of memory states, the effort required
to place the bounds within ε of each other is proportional to ε^(1/Γ),
where Γ is given by (14.22).
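
The proportionality just stated can be traced to the error bound quoted in the abstract, which has the form of a power of the number of memory states. The short derivation below is a sketch under assumptions: the exponent parameter of (14.22) is written Γ, and m_0 is a hypothetical name for the system-dependent normalizing constant. Γ is presumably negative, so that the bound shrinks as m grows.

```latex
% Assumed form of the bound: error <= (m / m_0)^Gamma, with m the number of memory states.
\[
\varepsilon \;\approx\; \left(\frac{m}{m_0}\right)^{\Gamma}
\quad\Longrightarrow\quad
m \;\approx\; m_0\,\varepsilon^{1/\Gamma}
\]
% Effort is linear in m, hence proportional to \varepsilon^{1/\Gamma}.
```

Under these assumptions, halving ε multiplies the required number of memory states, and hence the effort, by 2^(-1/Γ).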

A more favorable rate of convergence is obtained when the memory
sets are computed recursively. Memory states that are unlikely to
be recurrent under the optimal perceptive policy can be omitted;
those which were recurrent during the previous iteration may be
extended (by the addition in the memory tree of branches from the
nodes to which they correspond).

Problems of decoding a noisy Markov channel (see references
listed in Section 4) are subrectangular, and lend themselves to
convergence rate analysis. In most problems, however, there seems
to be little use in computing the contraction indices a and p.
Execution of two or three iterations of perceptive dynamic programming
yields more reliable indicators of convergence rates.

c.  Pseudo-perceptive  Dynamic  Programming 

Pseudo-perceptive dynamic programming is a computational procedure
in which the delayed state is guessed and substituted into the model
before optimization is performed, resulting in a reduction, by a
factor of N^2, in the number of augmented states considered during each
optimization step. The performance obtained will be an approximation
to the optimal performance: if the delayed state is optimized, and not
merely guessed, then the performance obtained will be an upper bound
as well. However, pseudo-perceptive dynamic programming does not then
yield a lower bound to optimal feasible performance.

d. Recursive Computation of the Memory Sets

Experience  indicates  that  the  choice  of  memory  sets  is  crucial  to 
efficient  performance  of  the  perceptive  dynamic  programming  algorithm. 

For  example,  computation  time  and  storage  requirements  increase 
linearly  with  the  number  of  memory  states;  yet,  certain  memory  states 
can  be  shown  a priori  to  occur  very  rarely  in  the  optimally  controlled 
system. 

Some  recommended  "tricks"  are: 

1) Do not add branches to node z of the memory tree if, whenever
the memory state is z, the optimal perceptive decision does
not depend on the delayed-state component of the augmented state.

2) Do not add branches to node z of the memory tree if z is
not recurrent under the optimal perceptive strategy
obtained during the most recent iteration.

3) Do not add branches to node z of the memory tree if all
entries of P(z) are small.
e. Minimization of Memory Size by Selective Pseudo-perception

The state guess s(·) may be selected according to an ad hoc
rule which causes the feasible strategy to perform as well as possible
(e.g. s = most likely state). It might instead be selected so that
the number of recurrent memory states under the feasible strategy
will be minimized. Such an approach assures that another iteration,
with a larger memory set, might be performed, although the current
feasible performance lower bound h[M^n] will suffer. During the
final iteration, this approach to the selection of s(·) may reduce
the cost of implementing the solution obtained.

f.  Initialization  Procedure 

Suppose  that  a perceptive  solution  has  been  obtained,  and  that, 
from  this,  a feasible  policy  has  been  designed.  The  feasible  policy 
determines  near  optimal  decisions  in  the  steady-state.  It  is  also 
necessary  to  determine  an  initialization  procedure  to  be  followed  by 
the  controller. 

A particularly simple way of doing so is the following: Represent
the system under the feasible strategy as a Markov chain, and
determine the relative values of all augmented states. Then solve a
finite horizon problem, in which the input set includes the memory set
as well as the original inputs; an input representing a memory state
indicates that the feasible policy should be used thereafter, starting
in the specified memory state. The value function will be monotone
increasing in the number of initialization steps allowed.

If the system under the feasible strategy is multiple chained,
then the finite horizon problem should be to maximize the eventual
gain. In the case of an N-armed bandit problem, the feasible
(steady-state) policy is trivially computed, since the previous decision
determines the optimal present decision. The initialization
procedure then constitutes an identification of the correct hypothesis.
22. A Computational Algorithm

In order to assess the practicality of perceptive dynamic
programming, a computer program was written to solve general FPS control
problems  with  undiscounted  infinite-horizon  performance  index.  The 
program  is  described  below.  Computational  results,  obtained  using 
this  program,  are  described  in  the  following  section. 

The  source  code,  which  is  written  entirely  in  PL/I,  is  listed  in 
Appendix  C.  It  has  a source  length  of  1250  cards,  and  the  object  code 
occupies  110K  bytes  of  storage  on  the  IBM  370/168. 

The  program  accepts  the  following  data  as  input: 

Title: 

A character  string  of  length  not  exceeding  32,  which 

identifies  the  problem  to  be  solved. 

Problem  dimensions 

N,  the  number  of  internal  states. 

NU,  the  number  of  inputs. 

NY,  the  number  of  outputs. 

NZ,  the  number  of  input-output  pairs. 

FMT, the output format (1 = "long", 0 = "short").

Termination specifications: (conditions under which execution
should be terminated)

MINERR, the minimum value of g^n - h^n.

MAX_M, the maximum number of memory states.

MAX_ESSM, the maximum number of essential memory states.

MAX_TIME, the maximum number of seconds to be allowed.

Transition probabilities:

Each matrix is preceded by a list of input-output pairs
and a single zero which marks the end of that list; the
matrix is then listed in row-major order.

Expected incremental reward vectors:

The vectors q(1), ..., q(NU) are entered in turn.

Computation then proceeds according to the following outline:

Step 1: Create a memory tree (hereafter denoted by M) containing
only the empty word ε, and set ERR = Q.

Step 2: Solve the perceptive problem. This is done by damped
value iteration (18.10), along with the test for non-optimal
actions of Hastings [1976]. The optimization is
performed only on X[M], the connected class of augmented
states consisting of a delayed internal state along with
an essential memory state. Computation is terminated when,
after k1 steps of value iteration, the Odoni bounds (18.11)
are within ERR · (.001)(1.2)^k1 of each other.

Step 3: Flag memory states that are recurrent under the
optimal perceptive strategy (indicated by a "G" in the
printout). For those memory states only, determine the
feasible strategy which selects the input most likely to
be optimal.

Step 4: Determine h^n by value iteration without optimization
of inputs. Computation is terminated when, after
k2 steps, the Odoni bounds are within
ERR · [(.001)(1.2)^k1] [(.01)(2)^k2] of each other.

Step 5: Flag memory states that are recurrent under both
the optimal perceptive strategy and the feasible strategy
for the present iteration (indicated by an "H" in the
printout).

Step 6: Set ERR = {the upper Odoni bound on g^n} - {the lower
Odoni bound on h^n}. Print a report of the current
iteration. If any termination specifications have been met,
then stop.

Step 7: For every triplet (z,u,y) satisfying

(i) z is an essential memory state that was recurrent
under the most recent optimal perceptive strategy,

(ii) u is an optimal input for some augmented state of
the form [i,z],

(iii) T^M(z,(u,y)) < z(u,y),

add to M the memory state which contains the
ℓ[T^M(z,(u,y))] + 1 rightmost input-output pairs in z(u,y).
Also add whatever memory states are required to satisfy
(8.4). Then return to step 2.
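
The memory-tree extension of Step 7 can be sketched as follows. This is an illustration under assumptions rather than the PL/I program: memory states are represented as tuples of input-output pairs, and T^M(z,(u,y)) is assumed to be the longest suffix of the word z(u,y) that is already a memory state, which is consistent with condition (iii) but is not defined in this section. The closure condition (8.4) is not modeled, and the function names are hypothetical.

```python
def transition(memory, z, pair):
    """Assumed memory transition: longest suffix of z followed by pair that lies in memory."""
    word = z + (pair,)
    for start in range(len(word) + 1):       # try the longest suffix first
        if word[start:] in memory:
            return word[start:]
    raise ValueError("the memory set must contain the empty word ()")


def extend(memory, z, pair):
    """Return memory with one more state if the current transition truncates z(u,y).

    The added state holds the len(T) + 1 rightmost pairs of the word z(u,y),
    where T is the state the current memory set would move to.
    """
    word = z + (pair,)
    current = transition(memory, z, pair)
    if len(current) < len(word):             # condition (iii): a proper truncation occurs
        new_state = word[len(word) - (len(current) + 1):]
        return memory | {new_state}
    return memory
```

Starting from the memory set containing only the empty word, repeated extension along the pair (1,1) yields the states (1,1), then (1,1)(1,1), and so on, one level of the tree per iteration.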
Further details regarding execution procedure and methodology
may be found in the source code.

The  output  consists  of  a page  which  lists  the  input  data,  followed 
by  an  iteration  report  for  each  iteration  performed.  The  iteration 
report  heading  contains  the  following  information: 

Line 1: The iteration number, the number of memory states,
the number of essential memory states, the time at which
preparation of the memory tree for value iteration was
concluded.

Line 2: The upper and lower Odoni bounds on g^n, the number
of value iteration steps performed and the time at which
value iteration was concluded, in Step 2.

Line 3: The upper and lower Odoni bounds on h^n, the number
of value iteration steps performed and the time at which
value iteration was concluded, in Step 4.
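
The bounds reported on Lines 2 and 3 can be illustrated with a small value-iteration routine. The sketch below is not the damped iteration (18.10) of the program; it is plain relative value iteration for an undiscounted problem, using the standard fact that, for a well-behaved (unichain, aperiodic) problem, the smallest and largest one-step value increments bracket the optimal gain. The function name, argument layout, and stopping tolerance are assumptions.

```python
def gain_bounds(P, q, tol=1e-6, max_steps=10000):
    """Relative value iteration with lower and upper bounds on the gain.

    P[u][i][j] -- probability of moving from state i to j under input u
    q[u][i]    -- expected incremental reward in state i under input u
    Returns (lower, upper, steps).
    """
    n = len(q[0])
    v = [0.0] * n
    lower = upper = 0.0
    for step in range(1, max_steps + 1):
        new_v = [
            max(q[u][i] + sum(P[u][i][j] * v[j] for j in range(n))
                for u in range(len(q)))
            for i in range(n)
        ]
        diffs = [new_v[i] - v[i] for i in range(n)]
        lower, upper = min(diffs), max(diffs)   # bracket on the gain
        v = [x - new_v[0] for x in new_v]       # renormalize to keep values bounded
        if upper - lower <= tol:
            return lower, upper, step
    return lower, upper, max_steps
```

With a single input (no optimization), the same routine evaluates a fixed policy, which corresponds to the computation of the feasible performance in Step 4.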

In the long format, the iteration report heading is followed by a
table in which N+1 lines are devoted to each essential memory state.

The column headings and data listed are as follows:

RC  Recurrent state flags "G" and "H" are listed below. "G"
indicates that the memory state is recurrent under the
optimal perceptive strategy; "H" indicates that the
memory state is also recurrent under the feasible strategy.

I  Delayed-state component of the augmented state.

U  Input selected by the feasible strategy (first line), and
optimal perceptive inputs (following lines). An asterisk
(*) indicates that the feasible strategy always picks the
optimal perceptive input.

V(G)  Relative value for the perceptive problem.

V(H)  Relative value for the feasible problem.

PROBS  For memory state z, P(z) is listed.

MEMORY STATES  The memory states are listed below in the form
of a left-handed tree.

In the short format, only the first line of each memory state table
is printed.
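
The PROBS column can be reproduced by hand for short memory states. The sketch below assumes that P(z), for a word z of input-output pairs, is the ordered product of the matrices entered for each pair; this assumption is not stated in this section, but it agrees with rows printed in the report for the Machine Maintenance and Repair Problem (for example 0.6561, 0.3078, 0.0361). The helper names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def word_matrix(matrices, word):
    """Assumed definition of P(z): ordered product over the pairs in the word z."""
    n = len(next(iter(matrices.values())))
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for pair in word:
        result = mat_mul(result, matrices[pair])
    return result


# Matrix entered for the pair (u, y) = (1, 1) in the example of Section 23a.
MATRICES = {
    (1, 1): [[0.81, 0.18, 0.01],
             [0.00, 0.90, 0.10],
             [0.00, 0.00, 1.00]],
}

P_TWO = word_matrix(MATRICES, [(1, 1), (1, 1)])
```

Row i of the result corresponds to delayed state i, which is how the N rows under each memory state in the long format are read here.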




23.  Computational  Results 


a.  The  Machine  Maintenance  and  Repair  Problem 

The Machine Maintenance and Repair Problem was formulated in
Section 3, and a procedure, which in principle leads to a solution,
was then described. That procedure is in fact equivalent to perceptive
dynamic programming based on the fixed family of increasing memory
sets {Z^(n-1) | n ∈ Z+}.

The solution was actually obtained by perceptive dynamic
programming on the basis of recursively computed memory sets, as described in
Section 21d. The largest intermediary Markov decision problem solved
had 93 states.

The steps that lead to this solution are briefly described.
During the first six iterations, perceived states determine the optimal
input, so feasible performance remains poor. Since pseudo-perception
initially takes the form s=1, input u=1 ("manufacture") is selected at
all times. On the seventh iteration, the input u=2 ("examine") is
selected whenever u=1 ("manufacture") occurred four times previously;
but this is done only for the purpose of obtaining a perception free
of delay. In iteration eight, the memory set is augmented by branches
corresponding to input u=2 ("examine"); that input is no longer
selected and feasible performance increases for the first time. A
similar pattern continues until sufficient memory has been allocated to
realize the optimal strategy, and to eliminate suboptimal decisions
motivated by perceptive information structure.


Note that this problem is not detectable. Indeed, there are two
possible decompositions into detectable parts: if the machine is
never repaired, then there is only one recurrent state and the system
is trivially detectable; if the machine is repaired, then all
information previous to the repair is dispensable; in either case a=0.

The rate of convergence of perceptive dynamic programming is
determined by the rate of absorption of transient states in the former case
which  is  ac  = .99,  = 2 (very  unfavorable).  The  convergence  rate 

for  memory  sets  used  in  section  3 is  bounded  by: 


n 

g 


-h 


n 


< (.99) 


1 


3.4025 
- .99  . 


The  actual  convergence  obtained  was,  of  course,  considerably  more 
rapid. 

The  input  deck  for  this  problem  took  the  form: 


// EXEC PLIXG,PROG='U.M13014.P10015.PLATZSYS.LOAD(LDMOD)'
//G.SYSIN DD *,DCB=BLKSIZE=2000

'MACHINE MAINTENANCE & REPAIR', 3,4,3,4,1, 20,.01,100,100,

1,1,0,

.81,.18,.01, 0,.9,.1, 0,0,1,

2,2,0,

.81,.09,.0025, 0,.45,.025, 0,0,.25,

2,3,0,

0,.09,.0075, 0,.45,.075, 0,0,.75

3,1,4,1,0,

.9025,.475,.25, .6525,.225,0, -.5,-1.5,-2.5, -2,-2,-2,

/*EOJ


The  computer-generated  report  is  given  on  the  next  29  pages. 
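
Before reading the report, it is worth checking that the transition data are consistent: for each input u, the matrices over all outputs y should sum to a stochastic matrix. The sketch below performs that check. The values are transcribed from the printed problem specification; the matrix shared by inputs 3 and 4 and the entry 0.0075 are read from the printout and should be treated as assumptions. The function name is hypothetical.

```python
# Transition matrices keyed by (input u, output y), as read from the printed report.
MATRICES = {
    (1, 1): [[0.81, 0.18, 0.01], [0.0, 0.90, 0.10], [0.0, 0.0, 1.00]],
    (2, 2): [[0.81, 0.09, 0.0025], [0.0, 0.45, 0.025], [0.0, 0.0, 0.25]],
    (2, 3): [[0.00, 0.09, 0.0075], [0.0, 0.45, 0.075], [0.0, 0.0, 0.75]],
    (3, 1): [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    (4, 1): [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
}


def row_sums_by_input(matrices):
    """For each input u, sum every row of every matrix (u, y) over all outputs y."""
    sums = {}
    for (u, _y), matrix in matrices.items():
        totals = sums.setdefault(u, [0.0] * len(matrix))
        for i, row in enumerate(matrix):
            totals[i] += sum(row)
    return sums


ROW_SUMS = row_sums_by_input(MATRICES)
```

Every entry of `ROW_SUMS` should equal 1 up to rounding; a row that does not is a sign of a keypunch or transcription error in the deck.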
MACHINE MAINTENANCE & REPAIR                    PROBLEM SPECS

3 STATES    4 INPUTS    3 OUTPUTS    4 I/O PAIRS
TIME LIMIT: 20.00       MIN ERR: 0.010
MAX MEM: 100            MAX ESS MEM: 100


TRANSITION PROBABILITIES:

Z   (U,Y)          P

1   (1,1)          0.8100   0.1800   0.0100
                   0.0000   0.9000   0.1000
                   0.0000   0.0000   1.0000

2   (2,2)          0.8100   0.0900   0.0025
                   0.0000   0.4500   0.0250
                   0.0000   0.0000   0.2500

3   (2,3)          0.0000   0.0900   0.0075
                   0.0000   0.4500   0.0750
                   0.0000   0.0000   0.7500

4   (3,1),(4,1)    1.0000   0.0000   0.0000
                   1.0000   0.0000   0.0000
                   1.0000   0.0000   0.0000


INCREMENTAL REWARDS:

U    Q

1    0.9025   0.4750   0.2500
2    0.6525   0.2250   0.0000
3   -0.5000  -1.5000  -2.5000
4   -2.0000  -2.0000  -2.0000




TABLE  1.01 


1 

ITERATION 

1 

MEM  = 

1 ESS  MEM  = 

1 

TIME  = 

0. 16 

1 

1 

0.A99  < G 

< 

0.5  n 

17  STEPS 

TIME  = 

0.25 

1 

0.250  < H 

< 

C.AAA 

9 STEPS 

TIME  = 

0.26 

RC 

I 

U 

V ( G ) 

VIH) 

PRCRS 

MEMORY  STATES 

GH 

1 

<E> 

1 

1 

2.61 

2.76 

1.0000 

0.0000 

0.0000 

2 

3 

0.50 

0. 36 

c. coco 

1.0000 

0.0000 

3 

A 

O.OP 

-0.75 

0. ccoo 

o.oocc 

l.COCC 
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♦ 

I  ITERATION  2 MEM  = 3 ESS  MEM  = 3 TIME  = 0.36  I 

I I 

I 0.999  < G < 0.995  8 STEPS  TIME  = 0.96  | 

I 0.250  < H < C.  395  1 A STEPS  TIME  = 0.50  I 

> 4. 

RC  I U V ( G ) V ( H ) PRC3S  MEMORY  STATES 

1 <E> 

1 1 2.98  l.COCO  O.OOCO  O.COCC 

2 3 0.98  O.COOO  l.OOOC  O.OCCO 

3 9 -0.02  O.COOO  0.0000  l.COOO 

GH  1 1 

1 1 2.07  2.63  0.8100  0.1  ROC  C.01CC 

2 3 0.3°  0.93  O.COOO  0.9000  0.1000 

3 9 -0.02  -O.BC  O.COOO  O.OOOC  l.COCO 

G 1 * 9 

1 1 2.98  l.COOO  O.OOCC  O.CCCO 

2 1 2.98  l.COOO  O.OOOC  0.^000 

3 1 2.98  l.COCO  C.COCC  0.0000 
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♦ - 

1 

1 

ITERATION  3 

MTM  = 5 

ESS  M E M = 5 

TIME  = 

1 

1 o 
1 

1 • 
1 o 

♦ 

I 

| 

1 

1 

C.477  < G < 

C .478 

10  STEPS 

TIME  = 

0.8  1 

1 

1 

1 

0.250  < H < 

C.  3R5 

S Tc  PS 

TIME  = 

C.  Rq 

1 

4- ♦ 


RC 

I 

U 

V ( G ) 

V ( H ) 

PRCPS 

MEMORY  STATES 

1 

<E> 

1 

1 

2.44 

1.C000 

0 . u C 0 0 

C.COOO 

2 

3 

0.46 

0. COCO 

1.C0CC 

c.rccc 

3 

4 

o 

• 

0 

1 

O.COOO 

o.oocc 

l.OCCO 

1 

1 

1 

1 

2.01 

0. 8 ICO 

0. 1 R OC 

C.OICO 

2 

3 

0.  36 

0. COCO 

0.5GCC 

0. 10C0 

3 

4 

-0.04 

o.cooo 

0.0000 

1 .coco 

GH 

1 

1 

1 

1 

1.67 

2.27 

0 . 6 r'  4 1 

0. 307e 

C • r 3 6 l 

2 

3 

0.27 

0.35 

O.COOO 

0.8100 

0.1500 

3 

4 

-C.C4 

-0.65 

c. coco 

o.cooo 

1 .0000 

r 

o 

1* 

4 

1 

1 

2.01 

0.8100 

0. 180C 

0 . 0 1 C 0 

2 

1 

2.01 

0.8100 

0.1800 

0.0100 

3 

1 

2.01 

0. 81C0 

0. 1 ROC 

0.01C0 

G 

1* 

4 

1 

1 

2.44 

l.COOO 

O.OOCC 

c.coco 

2 

1 

2.44 

1.C000 

O.OOOC 

C.COOO 

3 

1 

2.44 

l.COOO 

O.OOCC 

C.0000 
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TABLE  4.01 


I TERATION 

4 

MEM  = 

7 ESS  MEM  = 7 

TIME  = 

0.99 

— 4 

1 

I 

0.462  < C 

< 

C .464 

12  STEPS 

TIME  = 

1.  TO 

1 

1 

0.250  < H 

< 

0.392 

13  STEPS 

TIMF  = 

1.43 

1 

-4 

RC 

I 

U 

V ( G ) 

V ( H I 

PRORS 

MEMORY  STATES 

1 

<E> 

l 

1 

2.41 

l.COOO 

o.oooc 

0.0000 

2 

3 

0.45 

C.COOO 

1 .occc 

c.ccco 

3 

4 - 

0.05 

0.0000 

o.oocc 

l.CCCO 

1 

1 

1 

1 

1.07 

0.9100 

0. 18CC 

0.0100 

2 

3 

0.35 

O.CCCO 

0.9000 

0. 10C0 

3 

4 - 

0.05 

o.cooo 

OlOOOO 

1 .0000 

1 

1 

1 

1 

1.62 

0.6561 

0.307E 

C.0361 

2 

3 

0.2  6 

0.0000 

0.9100 

0.1 900 

3 

4 - 

0.05 

0 . C C 0 0 

o.oocc 

1 .0000 

GH 

1 

1 

1 

1 

1.33 

1.92 

0.5314 

0. 3951 

0.0734 

2 

3 

0.19 

0.  34 

O.COOO 

0. 7290 

0.2710 

3 

4 - 

0.05 

-0.59 

0. CCOO 

O.OOOC 

l.COOO 

G 

1* 

4 

1 

l 

1 .62 

0.6561 

0. 3079 

0.0361 

2 

1 

1.62 

0.6561 

0.3079 

0.0361 

3 

1 

1.62 

0.6561 

0. 3079 

0.0361 

G 

1* 

9 

1 

1 

1.97 

0. 9 100 

0.1900 

0.0  ICO 

2 

1 

1.97 

0.9100 

0.1800 

0.0100 

3 

1 

1.97 

C. 91CC 

0. 1BCC 

0.0100 

r 

1 * 

4 

l 

1 

2.41 

1 • CuOO 

O.OOCC 

O.CCCO 

c 

1 

2.41 

1 . cooo 

O.00C0 

o.cooo 

3 

1 

2.41 

1. CCCO 

O.OOOC 

o.cooo 
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1 

1 

I TERAT I ON 

5 

MEM  = 

9 ESS  MEM  = 9 

TIME  = 

1.55 

1 

1 

0.449  < G 

< 

C . 4 52 

14  STEPS 

time  = 

2.01 

1 

0.250  < H 

< 

0.374 

13  STEPS 

TIME  = 

2.18 

I 

U 

V ( G ) 

V ( H ) 

PRCRS 

MEMORY  STATES 

1 

<F> 

1 

1 

2.39 

l.COOO 

0.0000 

O.COOO 

2 

3 

0.44 

O.CCOO 

l.OOCC 

O.CCCO 

3 

4 - 

0.06 

O.COOO 

0.0000 

1 .COCO 

1 

l 

1 

1 

1.93 

0. 810C 

0. 1800 

0.0100 

2 

3 

0.34 

O.CCCO 

0.9000 

0.1000 

3 

4 - 

0.06 

0. oooo 

0.0000 

l.COOO 

1 

l 

1 

1 

1.57 

0. 6561 

0.307e 

C.0361 

2 

3 

0.25 

O.COOO 

O.81C0 

0.1900 

3 

4 - 

0.06 

O.COOO 

C.OOCC 

1 .0000 

1 

1 

1 

1 

1.27 

0.5314 

0.3951 

0.C734 

2 

3 

0.17 

O.COOO 

0.7290 

0.2710 

3 

4 - 

0.06 

O.COOO 

O.COOO 

1 .cooo 

* 

1 

l 

1 

1 

1 .04 

1.65 

0.4305 

0.4513 

o . 1 1 e 3 

2 

3 

0.09 

0.31 

O.COOO 

0.6561 

0.3439 

3 

4 - 

0.06 

-0.49 

O.COOO 

O.COCO 

l .COCO 

1* 

4 

1 

1 

1.27 

0.5314 

0.3951 

0.C734 

2 

1 

1.27 

0.5314 

0.3951 

0.0734 

3 

1 

1.27 

0.5314 

0.3951 

0.0734 

1* 

4 

1 

1 

1.57 

0.6561 

0.3078 

0.C361 

2 

1 

1.57 

0.6561 

0.3078 

0.0361 

3 

1 

1.57 

C. 6561 

0.3078 

0.0361 

1* 

4 

1 

1 

1.93 

0.8100 

0. 180C 

0.0100 

2 

1 

1.93 

0.8100 

0.1800 

0.0100 

3 

1 

1.93 

0.8100 

C.  18CC 

C.CIOO 

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TARLE 



G 

1* 

1 

1 2.39 

l.COOO 

o.ooco 

2 

1 2.39 

l.CCOO 

o.cocc 

3 

1 2.39 

1.  coco 

o.oocc 

O.COOO 

O.OOOC 

C.CCCC 
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q 

TABLE  6.01 

I ITERATION 

1 

6 

MEM  = 11 

ESS  MEM  = 11 

TIME 

= 2.30  1 

i 

1 0.438  < G 

< 

0 . A A 1 

16 

STEPS 

TIME 

| 

2.08  | 

1 0.250  < H 

< 

C.  172 

12 

STCPS 

time 

= 3.23  1 

I 

U 

VIG) 

V ( H ) 

PRC8S 

MEMORY  STATES 

1 

<E> 

1 

1 

2.37 

l.CCCO 

0.0 OCC 

C.CCCC 

2 

3 

0 . A 3 

O.COOO 

1.0000 

O.COOO 

3 

A 

-0.07 

O.CCOC 

o.occo 

1 .0000 

1 

l 

1 

1 

1.90 

0.8100 

0. 1 3CC 

C.01C0 

2 

3 

0.33 

O.COOO 

0 . 9000 

0. 1C00 

3 

A 

-0.07 

C. CO CO 

o.cocc 

i.COCO 

1 

1 

l 

I 

1.52 

0.6861 

0. 3078 

0.0361 

2 

3 

0.2A 

0. CGGO 

0.P1CC 

C. 19CC 

3 

A 

-0.07 

O.COOO 

o.oocc 

l .CCCC 

1 

- 

1 

1 

1 

1.22 

0.5316 

0. 3951 

C.073A 

2 

3 

0.16 

0. CCOO 

0. 729C 

C.271C 

3 

A 

-0.07 

O.COOO 

O.OOCC 

l .CC'.O 

1 

l 

l 

1 

0.97 

0 . A 3 05 

0.A513 

0.1183 

2 

3 

0.08 

C. COCO 

0.6561 

C. 3A  39 

3 

A 

-0.07 

O.COOO 

O.OOCC 

1 .0000 

l 

1 

1 

1 

0.78 

1 . 38 

0.  3 A 8 7 

0. A836 

0. 1677 

2 

3 

0.02 

0.26 

O.COOO 

0.6905 

C. A095 

3 

A 

-0.07 

-O.AC 

O.COOO 

O.OOCC 

I.COCO 

1* 

A 

1 

1 

0.97 

0. A3C5 

0 . A 5 1 3 

0.1183 

2 

1 

0.97 

0. A 306 

0 • A 5 l 3 

o. 1 1 e 3 

3 

1 

0.97 

0 . A 305 

0.A513 

0.1183 

I* 

A 

l 

1 

1.22 

0 . 5 3 1 A 

0.3951 

0.073A 

2 

1 

1.22 

0. 531A 

0.3951 

C.073A 

3 

1 

1.22 

0. 531A 

9. 3951 

C . C 7 3 A 
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TARLE 

6.0? 

G 

1* 

4 

1 l 

1 

1 

1.52 

C.  6561 

0. 3078 

C.0361 

2 

1 

1.52 

0.6561 

0.3078 

C.0361 

3 

1 

1.52 

0.6561 

0.307C 

0.0361 

G 

1* 

4 

1 

1 

1.90 

0. 81C0 

0. 18CC 

C.C1CC 

2 

1 

1.90 

0.8100 

0.18CC 

0.01C0 

3 

1 

1.90 

0.8100 

0.1800 

0 . o l CO 

G 

l* 

4 

1 

1 

2.37 

1 .COCO 

O.OOCC 

C.CCCO 

2 

1 

2.37 

1 .COOO 

O.OOCO 

o.coco 

3 

1 

2.37 

l.COOO 

O.OOOC 

0.0000 

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TA«LE  7.01 


I TERATICN 


MEM 


13 


ESS  ML  M 


13 


0. 

A 3 7 

< G < 

C. 

A 3 9 

13  S T F ° S 

0. 

195 

< H < 

0. 

369 

1A  STEPS 

I 

U 

VIC) 

V ( H ) 

PRC3S 

1 

1 

1 

2.36 

2 . 9 A 

l.COOO 

0.0000 

O.COOO 

2 

3 

0.  A3 

0.  A3 

O.COCO 

1 .OOOC 

O.COOO 

3 

A 

-0.07 

-0.91 

0. CCOO 

0. oocc 

l.CCCC 

1 

1 

1 

1.90 

2.  AC 

0. 8100 

0.1800 

0.0100 

2 

3 

0.33 

0.2A 

O.COOO 

0.9CCC 

0 . 1 0 C 0 

3 

A 

-0.07 

-0.96 

0. cnoo 

0.0000 

l.COCO 

1 

1 

1 

1.52 

1.93 

0.6561 

0.3078 

0.0361 

2 

3 

0.2A 

0.C7 

O.COOO 

0.81CC 

0. 1 9CC 

3 

A 

-0.07 

-I  .Cl 

O.COOO 

0.0000 

l.COOO 

1 

1 

1 

1.21 

1.51 

0. 531A 

0.3951 

0 . 0 7 3 A 

2 

3 

0.16 

-0.09 

O.COOO 

0.729C 

0.27  1C 

3 

A 

-0.07 

-1 . 06 

0.0000 

0.0000 

L.O000 

1 

2 

2 

0.96 

1 . 15 

0. A305 

0. A513 

0. 1 1P3 

2 

3 

0.08 

-0.2A 

O.COOO 

0.6561 

C.  'A39 

3 

A 

-0.07 

-1.11 

O.COOO 

0.0000 

l.COOO 

1 

2 

2 

0.76 

0.  3AR7 

0. AR  36 

C. 1 677 

2 

3 

0.02 

O.COOO 

0.5905 

0. A095 

3 

A 

-0.07 

O.COOO 

0.0000 

l.COOO 

1 

2 

2 

0.59 

0. 282A 

0.A98C 

C.2L95 

2 

3 

-O.OA 

O.COOO 

0.531A 

0 . A 6 8 6 

3 

A 

-0.07 

O.COOO 

0.0000 

1.0000 

2* 

1 

2 

0. 76 

0.  3A°7 

0. A836 

0.1677 

2 

2 

0.76 

0. 3A67 

0 . A 8 36 

0.1677 

3 

2 

0.76 

0. 3A87 

0.A836 

0.1677 

TIME  = 

TIMC  = 
T ! Mr  = 


3.77 

9.08 
A.  30 


RC 

GH 


OH 


GH 


GH 


MEMORY  STATES 
<E> 


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table 

7.02 

G 

2* 

6 1 1 

1 1 

1 

2 

0.96 

0.6105 

0.6511 

0.1181 

2 

2 

0.96 

0.6305 

0.6513 

0.1183 

3 

2 

0.96 

C. 6 105 

0.65  13 

0.1183 

G 

1* 

6 

1 

1 

1.21 

0.5116 

0.3951 

0.0716 

2 

1 

1.21 

0.5316 

0. 3<V5  1 

0.0736 

3 

l 

1.21 

0. 5 116 

0.1951 

0.0716 

G 

1* 

6 

1 

1 

1.52 

0.6561 

0.3078 

0.0  16  l 

2 

i 

1.52 

0.6561 

O.3073 

0.0361 

3 

1 

1.52 

0.6561 

0. 107R 

0.C361 

G 

1* 

6 

1 

l 

1.90 

o.eioo 

0. 18CC 

0.01CC 

2 

1 

1.90 

0.3100 

0. 130C 

0.0100 

3 

1 

1.90 

0.  RICO 

0. tROO 

0.0100 

G 

1* 

6 

1 

i 

2.  16 

l.COOO 

0.0000 

c.occo 

2 

1 

2.  36 

1.0000 

0.0000 

0.0000 

3 

1 

2.16 

1 . ccoo 

o.oocc 

o.cooc 

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TABLE  8.01 


I TERAT I ON 

8 

MEM  = 15 

ESS  MEM  = 14 

TIME  = 

4.5  3 

0.432  < G 

< 

C.435 

14  STEPS 

TIME  = 

5.  36 

0.308  < H 

< 

0.400 

12  STFPS 

TIME  = 

5.72 

♦ 4 


RC 

I 

U 

V ( G ) 

V ( H ) 

PRCB£ 

MEMORY  STATE 

GH 

l 

1 

1 

1.76 

1.85 

0.8100 

0. 18CC 

O.C ICO 

2 

3 

0.19 

-0.  13 

0. CCOO 

0.9CCC 

C.  1CC0 

3 

4 

-0.21 

-0.55 

O.COOO 

O.OOCO 

l.COCO 

GH 

l 

1 

1 

1 

1 .37 

1.47 

C . 6 5 6 1 

0.3070 

0.0361 

2 

3 

0.10 

-0.22 

O.COOO 

0.81CC 

c.  ncc 

3 

4 

-0.21 

-0.88 

O.COOO 

0.0000 

1.0000 

GH 

1 

1 

1 

1 

1.06 

l . 1C 

0.5314 

0. 3951 

C.0734 

2 

3 

0.02 

-0.28 

O.C  TOO 

0.729C 

0.2710 

3 

4 

-0.21 

-0.76 

O.COOO 

0.0000 

l.CGOO 

GH 

2 

1 

1 

2 

0.81 

0.  PC 

0.4305 

0.4513 

0.1183 

2 

3 

-0.05 

-0.3  3 

O.COOO 

0.6561 

0. 3439 

3 

4 

-0.21 

-0.6’ 

O.COOO 

O.OOOG 

1 .0000 

2 

1 

1 

2 

0.61 

0. 3487 

0.4036 

C. 1677 

2 

3 

-0.12 

O.COOO 

0.5905 

C .4055 

3 

4 

-0.21 

O.COOO 

0.0000 

1 .COCO 

2 

1 

1 

2 

0 . 4 A 

0.2824 

0.498C 

C . .3 1 9 5 

2 

3 

-0.18 

O.COOO 

0.5314 

C . 4 6 8 6 

3 

4 

-0.21 

O.COOO 

0.0000 

1 .CCOO 

2* 

4 

1 

2 

0.61 

0.3487 

0.4836 

0.1677 

2 

2 

0.61 

0.3487 

0.4836 

C. 1677 

3 

2 

0.61 

0.3487 

0.4836 

0.1677 

GH 

2* 

4 

1 

2 

0.81 

0.8C 

0.4  305 

0.4513 

0.1103 

2 

2 

0.81 

0.80 

0.4305 

0.4513 

0.1183 

3 

2 

0.81 

O 

ac 

• 

o 

0.4305 

0.4513 

0.1183 
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TARL  E 

8.02 

GH 

I * 

4 1 

1 1 

1 

1 

1.06 

l . 10 

0.5314 

0.3551 

0 .0734 

2 

1 

1.06 

1 . 1C 

0.5314 

0.3551 

0.C734 

3 

1 

1.06 

1 . 1C 

0.5314 

0.3551 

0.C734 

GH 

l * 

4 

1 

1 

1 .37 

1.47 

0.6561 

0.3078 

C.C36  1 

2 

l 

1.37 

1.47 

C . 6 5 6 l 

0.3078 

0.0361 

3 

1 

1.37 

1.47 

C. 6561 

0.3078 

0.0361 

GH 

1* 

4 

1 

1 

1.76 

i . eo 

o. e 100 

0. 13CC 

0.01C0 

2 

1 

1.76 

1 . 85 

0.8100 

0. 18 CO 

0.0100 

3 

1 

1.76 

1 . 85 

0. 8100 

C.  1 «CC 

o.oico 

GH 

1 

2 

1 

1 

2.02 

2.  lfc 

0. C100 

0.05CC 

0.0025 

2 

3 

0.24 

-0.06 

o.cooo 

0.45CC 

0.0250 

3 

4 - 

0.21 

-1.06 

O.CCCO 

O.OOCC 

0.250C 

GH 

4 

3 

1 

3 

0.22 

0.01 

c.cooo 

0.05CC 

C.C075 

2 

3 

0.  15 

O.Cl 

O.COOO 

0.45CC 

0.0750 

3 

4 - 

0.21 

O.Cl 

c. coco 

C.COCC 

C. 75CC 

GH 

1* 

4 

1 

L 

2.23 

2.41 

1 . cooo 

O.OOCC 

o.occo 

2 

1 

2.2  3 

2.41 

l.CCCO 

o.cccc 

0.0000 

3 

1 

2.23 

2.41 

l.CCOC 

O.OOCC 

C.CCCC 

\ 
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TABLE  9.01 


♦ 


■f 


I ITERATION 
I 

I 0.429  < G 

I 0.284  < H 


RC  I U V ( G ) 

G 1 

1 1 1.75 

2 3 0.19 

3 4 -0.21 

GH  1 

1 1 1.36 

2 3 0.10 

3 4-0.21 

GH  1 

1 1 1.05 

2 3 0.02 

3 4-0.21 

GH  1 

1 1 0.79 

2 3 -0.05 

3 4 -0.21 

GH  2 

1 2 0.59 

2 3-0.12 

3 4-0.21 

1 

1 1 0.4< 

2 3-0.18 

3 4-0.21 

GH  2* 

1 2 0.59 

2 2 0.59 

3 2 0.59 

GH  1* 

1 1 0.79 

2 1 0.79 

3 1 0.79 


9 

mem  = 18 

0. 

431 

C. 

404 

V ( H ) 

PRCRS 

0. 8100 

o.cooo 

o.cooo 

-0.84 

C . 6 5 6 1 

-2 . 56 

O.COOO 

-3.33 

o.cooo 

-1.19 

0.5314 

-2.62 

O.COOO 

-3.22 

c.ccoo 

-1.49 

0. 4 305 

-2.66 

0.  c .300 

-3.09 

O.COOO 

-1.74 

0.  *487 

-2.69 

O.COOO 

-2.97 

O.COOO 

0.2324 

O.COOO 

O.COOO 

-1.74 

0. 3487 

-1.74 

0. 34R7 

-1.  74 

0.  3487 

-1  .49 

0.4  305 

-1.49 

0.4305 

-1.49 

0.4  305 

ESS  MEM  = 17

16  STEPS 
13  STEPS 


0. 1RCC 
0.9000 
0.0000 

0.0  ICC 
0. 1CCC 
1 .OOCO 

0.3078 

O.RIOC 

O.OOOC 

C.0361 

0.1900 

1.0000 

0.3951 

0.7290 

O.COOO 

0.0734 
0.2710 
1 .OOCO 

0.451? 

0.6561 

o.nocc 

0.1183 

0. 3439 

1 . COOO 

0.4836 

0.5905 

O.OOOC 

0.1677 
C . 4095 
1 .0000 

0.49GC 

0.5314 

o.ooco 

0.2195 
0.4686 
1 .0000 

0.43  36 
0.4836 
0.4836 

0.1677 

0.1677 

0.1677 

0.4513 

0.4513 

0.4513 

C.  1 183 
0.1183 
C.  1 1 83 

TIME  = 5.96 

TIME  = 7.11 

TIME  = 7.73 


MEMORY  STATES 
1 


1 


1 


1 


1 


l 


4 


4 
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TABLE 

9 

.02 

GH 

1* 

A 1 

1 

1 

1 

1 

1.05 

-1.19 

0. 5314 

0.3951 

0 . 0 7 3 A 

2 

I 

1.05 

-1.  19 

0. 531A 

0.3951 

0.C73A 

3 

I 

1 .05 

-1.19 

0.5314 

0.3951 

0.C73A 

GH 

I * 

A 

1 

1 

1.36 

-0. 8A 

0.6561 

0.3078 

0.0361 

2 

1 

1.36 

-0. 8A 

0.6561 

0.3078 

C.0361 

3 

I 

1.36 

-0. 8A 

C.  6561 

0.3078 

C.0361 

GH 

I 

2 

1 

I 

1.58 

-0.62 

0.6561 

0.2268 

0.0196 

2 

3 

0. 1A 

-2.52 

0. cooo 

0.A050 

0.0700 

3 

4 - 

0.21 

-3.  A3 

0. cron 

o.cocc 

C.25C0 

GH 

I* 

A 

1 

1 

1.75 

-0.  A1 

0.8100 

0. 180C 

0.C100 

2 

1 

1.75 

-0.  Al 

0.8100 

0. 18C0 

0.0100 

3 

1 

1.75 

-0.  A 1 

0. PICO 

0. 1 8CC 

C.C1CC 

1 

2 

1 

I 

2.01 

0.8100 

0.0900 

0.0025 

2 

3 

0.2A 

0. CCCO 

0.45C0 

0.0250 

3 

4 - 

0.21 

0. coco 

O.OOCC 

C.25CC 

GH 

* 

1 

1 

1 

I 

1.81 

-0.  AC 

0.6561 

0.1539 

0.0090 

2 

3 

0.19 

-2.A6 

0.  cooo 

0. A05C 

C . 04  7 5 

3 

A - 

0.21 

- 3 . A 8 

o.cooo 

O.OOCC 

0 . 2500 

A 

3 

1 

3 

0.21 

o.ccoo 

0.09CC 

0 . CO  7 5 

2 

3 

0.15 

o.cooo 

0.A5CC 

C . 0750 

3 

A - 

0.21 

o.cooo 

0.0000 

0.7500 

GH 

A 

l 

1 

1 

0. 1A 

-2.3C 

o.cooo 

0.1535 

C.027  l 

2 

1 

0.07 

-2.3C 

o.cooo 

0 . AO  50 

0. 1A25 

3 

A - 

0.21 

-2.  30 

o. cooo 

O.OOCC 

0.78CC 

GH 

I * 

A 

1 

I 

2.22 

0.  to 

l.COOO 

O.OOOC 

C.CCCO 

2 

1 

2.22 

0.  1C 

l.COOO 

0.0000 

O.OOOC 

3 

1 

2.22 

0.  10 

1 .CCCO 

O.OOOC 

O.CCOO 

\ 
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TABLE  10.01 

ITERATION 10    MEM = 23    ESS MEM = 21    TIME = 8.06
0.430 < G < 0.431    12 STEPS    TIME = 9.12
0.250 < H < 0.365    14 STEPS    TIME = 9.27

RC 

I 

U 

V I G ) 

V ( H ) PRCPS 

MEMORY  STATES 

1 

1 1 

l 

1 

-0.35 

0.6561 

0.3076 

0.0361 

2 

3 

-l.ftO 

0. ccco 

0. 81CC 

0. 19CC 

3 

4 

-1.91 

O.C JOO 

O.OOOC 

1 .rCCC 

1 

l 

1 

1 

-0.66 

0.5314 

0.3951 

0.0734 

2 

3 

-1.69 

c. ccro 

0.  72  5 C 

0 . 2 7 1 C 

3 

4 

-1.91 

0. COOO 

0.0000 

1 .COCO 

1 

l 

1 

1 

-0.92 

0.4  305 

0.4513 

0.1183 

2 

3 

-1  . 7ft 

o.ccco 

0.6561 

C.  34  38 

3 

4 

-1.91 

o.cooo 

o.oocc 

1 .cccc 

1 

1 

1 

l 

-1.12 

0 . 3 4 ° 7 

0.40  36 

0.1677 

4. 

3 

-1.82 

o.ccco 

0. 59C5 

C . 4095 

3 

A 

-1.91 

o.ccco 

o.oocc 

1 .PCCC 

e,h 

1 

♦ ^ 

1 

1 

-1.27 

1.25  0.2024 

0.498C 

0.2195 

2 

3 

-1.03 

0.26  O.CCCO 

0.5314 

C . 4 6 8 6 

3 

A 

-1.91 

-0.37  O.CCCO 

O.OOCC 

1 .COCO 

o 

l* 

4 

1 

1 

-1.12 

0.  3407 

0.4836 

0.1677 

2 

1 

-1.12 

0. 3487 

0.4836 

C. 1677 

3 

1 

-1.12 

0. 3487 

0.48  36 

C. 1677 

VI 

1* 

4 

1 

1 

-0.9? 

0.4  30 r’ 

0.4513 

0.1183 

2 

1 

-0.92 

0.4  305 

0.4513 

c . 1 1 e 3 

3 

1 

-0.92 

o 

• 

X 

o 

0.4513 

C. 1 163 

G 

1* 

4 

l 

1 

-0.6ft 

0.5  314 

0.3951 

0.0734 

2 

l 

- 0 . ft  6 

0.5  3 14 

0. 3951 

C.C734 

3 

1 

-0.66 

0.5314 

0.3951 

C.0  7 34 

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1 

1 

1 

-0.48 

0. 5 314 

0. 3222 

0.048R 

2 

3 

-1.65 

o.cooo 

0.3645 

0. 1 1C5 

3 

4 

-1.91 

o.cooe 

0.0 oco 

0.2500 

G 

1* 

1 

1 

-0.35 

0.6561 

0.3078 

C.C361 

2 

1 

-0.35 

0.6561 

0.3078 

C.0361 

3 

1 

-0.35 

0.6561 

0.3078 

0.0361 

1 

1 

1 

-0.13 

C . 6 5 6 1 

0.2268 

C . 01 96 

2 

3 

-1.56 

O.COOO 

0.4050 

0.O7C0 

3 

4 

-1.91 

O.COOO 

0.0000 

0.2500 

1 

1 

1 

-0.30 

0.5  3 14 

C.2566 

0.03  1C 

2 

3 

-1.61 

O.COOO 

0.3645 

0.088C 

3 

4 

-1.91 

O.COOO 

0.0000 

0. ?500 

4 

1 

3 

-1.53 

O.COOO 

0.03  1C 

C . 0 1 6 5 

2 

3 

-1.64 

O.COOO 

0.405C 

0. 1 200 

3 

4 

-1.91 

o.cooo 

0.0000 

0. 7500 

G 

1* 

1 

1 

0.04 

0.P100 

0. 1 RCC 

0 . 0 1 C C 

2 

1 

0.04 

0.3100 

0. 18CC 

C.01CC 

3 

1 

0.04 

o. e ioo 

0.1 RCC 

o.nico 

1 

1 

1 

0.29 

0.  g ICO 

0.09CC 

0.0025 

2 

3 

-1.47 

o.cooo 

0.4500 

C . 02  50 

3 

4 

-1.91 

o.cooo 

O.OOCC 

0.2500 

1 

1 

l 

0.09 

0.6561 

0.1539 

0.009C 

2 

3 

-1.52 

o.cooo 

0.4050 

0.0475 

3 

4 

-1.91 

c. ccoo 

C.OCCC 

0.2500 

1 

1 

1 

-0.10 

0.5314 

0 . 1976 

C.0184 

2 

3 

-1.57 

o.cooo 

0.3645 

0.0677 

3 

4 

-1.91 

o.cooo 

O.OOCC 

0.2500 

4 

1 

3 

-1.49 

o.cooo 

0.09CC 

C.0075 

2 

3 

-1.56 

o.cooo 

0.4500 

0.0750 

3 

4 

-1.91 

c. coco 

O.OOCC 

0.75 CO 

TABLE  10.02
2 1 1 


4 


2 


1 


1 


4 


2 


1 


L 


1 


V 


1 
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MACHINE 

MAINTENANCE 

C REPAIR 

PACE  1R 

1 

4 

3 - 

1.56 

O.COOO 

0.1539 

eg 

C 

• 

c 

2 

3 - 

1.67 

C. COCO 

0.405C 

0.1425 

3 

4 - 

1.91 

C. COCO 

0. cccc 

C. 75CC 

l 

4 

3 - 

1.63 

O.COOO 

0. 1976 

0.0551 

2 

1 - 

1.75 

c. cooo 

0.3645 

0.2C32 

3 

4 - 

1.91 

c. cccc 

0. cocc 

C. 76CC 

r- 

1 

1 * 
1 

0.52 

1 .cooo 

O.000C 

c.cocc 

2 

l 

0.52 

1. ccoo 

O.OCOC 

O.COOO 

3 

1 

0.52 

1. coco 

O.OOCC 

C.CCCC 

T4HIE  10 
l 


1 
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machinf  maintenance  c repair 


TABLE  11.01 


ITERATION  II 


ESS  MEM  = 23 


9.42  I 


0. 

423 

< G < 

C. 

427 

18  STEPS 

0. 

259 

< H < 

C. 

4 1 C 

12  STEPS 

I 

U 

V ( G ) 

V ( H » 

PRCRS 

1 

1 

1 - 

0.36 

C. 6561 

0.3078 

C.C361 

2 

3 - 

1.60 

0.0000 

0.81CC 

0. 19C0 

3 

4 - 

1.91 

O.COOO 

0.0000 

l .CCOO 

1 

1 

1 - 

0.63  - 

l . 15 

0.5314 

0. 3951 

C . 07  34 

2 

3 - 

1.69  - 

2.61 

O.COOO 

0.729C 

0.2710 

3 

4 - 

1.91  - 

3.2C 

O.COOO 

0.0000 

1 .COCO 

1 

1 

1 - 

0.94  - 

1 .4P 

C.  4 305 

0.4513 

c . 1 1 e 3 

2 

3 - 

1.76  - 

2.65 

O.COOO 

0.6561 

C. 3439 

3 

4 - 

1.91  - 

3.C3 

O.COOO 

O.OOOC 

1 .C°OG 

1 

2 

2 - 

1.15  - 

1.73 

0. 3487 

0.4036 

C.  1677 

2 

3 - 

1.82  - 

2.67 

O.COOO 

0.5905 

0.4095 

3 

4 - 

1.91  - 

2 . 96 

O.COOO 

O.OOOC 

1 .CCOO 

1 

2 

2 - 

1 . 32 

0.  2 0 24 

0.498C 

C.21 95 

2 

3 - 

1.88 

O.COOO 

0.5314 

0.4686 

3 

4 - 

1.91 

O.COOO 

0.0000 

1 .COCO 

4 

1 

1 

1 - 

1.46 

0.  2 2 F 8 

0.4991 

C.2722 

2 

4 - 

1.91 

O.COOO 

0.47e3 

C.5217 

3 

4 - 

1.91 

O.COOO 

0.0000 

l .COCO 

2* 

4 

1 

2 - 

1 . 32 

C.  2 8 2 4 

0.498C 

C.2195 

2 

2 - 

1.32 

0. 2824 

0.498C 

0.2195 

3 

2 - 

1 . 32 

0. 2824 

0.4980 

0.2195 

2* 

1 

2 - 

1.15  - 

1 . 73 

0.  3487 

0.4836 

C. 1 677 

2 

2 - 

1.15  - 

1.73 

0.3487 

0.4836 

C.  1677 

3 

2 - 

1.15  - 

1 .73 

0. 3487 

0.4336 

0.1677 

TIME 

TIMF 


I I.  11  I 
11.55  I 


M F MHR  Y STATES 
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GH 

If 

1 

1 

-0.94 

-1.48 

0.4305 

0.4513 

0.1183 

2 

1 

-0.94 

-1 .48 

0.4305 

0.4513 

0.1183 

1 

l 

-0.94 

-1.48 

0. 4 305 

0.4513 

0.1183 

GH 

1 f 

1 

1 

-0.68 

-1  . 19 

0.5314 

0.3951 

C .07  34 

2 

1 

-0.68 

-1.19 

0.5314 

0.3951 

0.0734 

3 

1 

-0.69 

-1.16 

0.5314 

o 

• 

o 

v/i 

0.^734 

GH 

1 

1 

1 

-0.50 

-1.0  1 

0.5314 

0.3 222 

0.0468 

2 

3 

-1.65 

-2.58 

o.cooo 

0. 3645 

0.1105 

3 

4 

-1.91 

- 3.  3C 

0.  ccco 

O.CCCO 

0.250C 

GH 

l-» 

1 

1 

-0.36 

-0.83 

0.6561 

0.3078 

0.C361 

2 

1 

-0.36 

-0.83 

0.6561 

0.3078 

0.0361 

3 

1 

-0. 36 

-0.  83 

0.6561 

0.3079 

C.C  361 

1 

1 

1 

-0.14 

0.6561 

0.22te 

C . C 1 9 6 

2 

3 

-1.56 

C.COOO 

9.4Q50 

0.0700 

3 

4 

-1.91 

O.COOO 

o.cocc 

C.25C0 

GH 

1 

1 

1 

-0.30 

-0.82 

0.5314 

0.2566 

0 . 0 3 1 C 

2 

3 

-1.61 

-2 . 54 

O.COOO 

0.3645 

0. ^PQ0 

3 

4 

- 1.91 

-3.38 

0. CCCO 

O.GOOC 

O 

• 

PO 

v» 

o 

o 

G 

4 

1 

3 

-1  .58 

C.COOO 

0.09  ic 

C.0165 

2 

3 

-1.64 

o.cooo 

0.4050 

0.  V 2C0 

3 

4 

-1.91 

0. ccco 

O.COOO 

C . 7500 

GH 

1 f 

1 

1 

0.03 

-0.41 

o. e ioo 

0. 13CC 

C.01CC 

2 

1 

0.03 

-0.4  1 

0 . 8 1 CO 

0. 180C 

0.0100 

3 

1 

0.0  3 

-0.4  1 

C. 81  CO 

0. 1 8CC 

0.C100 

1 

1 

1 

0.29 

0.8100 

0.09CC 

C.C025 

? 

3 

-1.4  7 

0. GOOD 

0 . 45  OC 

0.0250 

3 

4 

-1.91 

0. c^co 

o.cocc 

C .25CC 

1 

1 

1 

0.09 

0.6561 

0. 1539 

C.C090 

2 

3 

-1.52 

o.crco 

0.4050 

C . 04  75 

3 

4 

-1.91 

o.ccco 

o.oo;c 

C.25C0 

TABLE  11.02 
4 1111 


4 


2 


4 


2 


1 


1 


4 


2 


1 


» 

\ 
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MACHINE  MAINTENANCE  £ REPAIR 
GH  1 


1 

1 -0.1C 

-0.62 

0 . 5 3 1 A 

2 

3 -1.57 

-2.  AP 

0. COCO 

3 

A -1.91 

- 3 . A 1 

O.COOO 

A 

1 

3 -1.A9 

0. ccoo 

2 

3 -1.56 

O.COOO 

3 

A -1.91 

O.COOO 

A 

1 

3 -1.56 

O.COOO 

2 

3 -1.68 

0. coco 

3 

A -1.91 

O.COOO 

A 

1 

3 -1.63 

-2.31 

o.ccoo 

2 

1 -1.7A 

-2.31 

o.ccoo 

3 

A -1.91 

-2.31 

O.COOO 

1* 

1 

1 0.51 

o.  ic 

1. coco 

2 

1 0.51 

0.  1C 

1. ccoo 

3 

1 0.51 

0.  1C 

1 .coco 

PACE  21 

TABLE 

11.03 

0.1976 
0. 36 A 5 

o.oocc 

0 . 0 1 8 A 
0.0677 
0.2500 

l 

1 2 

0.09C0 

0.A5CC 

O.OOCC 

0.0075 

C.075C 

C.7500 

3 

0.1539 
0. A05C 
O.OOOC 

0.0271 
C. 1A25 
0.7500 

1 

0.1976 
0. 36A5 
O.OOCC 

0.0551 

C.2032 

0.7500 

l 

o.oocc 

O.OOCC 

O.OOCC 

O.COCO 

c.ooco 

C.CCCO 

A 

/ 


MACHINE  MAINTENANCE  & REPAIR
PAGE  22


TABLE  12.01 


ITERATION 12    MEM = 31    ESS MEM = 29    TIME = 11.94
0.420 < G < 0.425    18 STEPS    TIME = 14.10
0.317 < H < 0.409    12 STEPS    TIME = 14.38


RC    I    U    V(G)    V(H)    PROBS    MEMORY STATES

1 

1 1 

l 

l -0.37 

0.6561 

0.3078 

C . C 3 6 1 

2 

3 -1.60 

O.COCO 

0.81C0 

0. 1900 

3 

4 -1.91 

0. COCO 

O.OOCC 

1 .COOO 

1 

1 

1 

1 -0.69 

0.5314 

0.3951 

C . 07  34 

2 

3 -1.69 

0. COCO 

0.729C 

0.2710 

3 

4 -1.91 

c. cooo 

O.OOCC 

1 .COCO 

1 

1 

1 

1 -0.95 

0.4  <05 

0.4513 

0.1183 

2 

3 -1.76 

O.COOO 

0.6561 

0.3439 

3 

4 -1.91 

O.CCOO 

0.0000 

l .COCO 

1 

1 

l 

1 -1.16 

0.  '497 

0.4836 

0.1677 

2 

3 -1.02 

o.cooo 

0.5905 

C • 4095 

3 

4 -1.91 

C. CCGC 

O.OCOC 

1 .CCCO 

1 

1 

1 

1 -1.32 

0. 2P24 

0.458C 

0.2195 

2 

3 -1.83 

o.cooo 

0.5314 

0.4696 

3 

4 -1.91 

O.COOO 

O.OOCC 

1 .CCOO 

GH 

4 

1 

1 

1 -1.45 

-2.29 

0.2288 

0.4991 

0.2722 

2 

4 -1.91 

-2.29 

O.COOO 

0.4783 

0.5217 

3 

4 -1.91 

-2.25 

O.CCOO 

O.OOCC 

1 .COCO 

GH 

1* 

4 

1 

1 -1  . 32 

-2.14 

0.2924 

0.498C 

C.2195 

2 

1 -1.32 

-2.14 

0.2824 

0.4980 

0.2195 

3 

1 -1.32 

-2.  14 

0. 2924 

0.4980 

0.2195 

GH 

1* 

4 

1 

1 -1.16 

-1.96 

0. 3487 

0.4836 

8.1677 

2 

l -1.16 

-1.96 

0. 3487 

0.4836 

0.1677 

3 

l -1.16 

-1.96 

0. 3487 

0.4836 

0.1677 
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PAGE  2 3 

GH 

i* 

1 

l 

-0.95  -1 

.73 

0.4305 

0.4513 

0.1183 

2 

l 

-0.95  -1 

. 73 

0.4305 

0.4513 

0.1183 

3 

l 

-0.95  -1 

. 73 

0.4305 

0.4513 

0.1183 

1 

l 

l 

-0.P0 

0.4305 

0.3857 

0.0864 

2 

3 

-1.72 

O.COOO 

0.328C 

0.1469 

3 

A 

-1.91 

0. COOO 

O.OCCC 

C.25CC 

GH 

1* 

1 

1 

-0.69  -1 

.41 

0.5314 

0.3951 

0.0734 

2 

1 

-0.69  -1 

.41 

0.5314 

0.3951 

0.C734 

3 

1 

-0.69  -1 

.41 

0.5314 

0. 3951 

0. C734 

1 

1 

1 

-0.51 

0.5314 

0.3222 

0.04eR 

2 

3 

-1.65 

0. COCO 

0.3645 

0.1105. 

3 

4 

-1.91 

o.coco 

O.OCCC 

C.25C0 

1 

1 

1 

-0.64 

0.4305 

0.3266 

0.0620 

2 

3 

-1.69 

G.COOO 

0.328C 

0.1  ^ 44 

3 

4 

-1.91 

o.coco 

O.COCC 

C . 2 > l C 

GH 

1* 

1 

1 

-0.37  -0 

.59 

0.6561 

0.307? 

0.036  1 

2 

1 

-0.  17  -0 

.59 

0.6561 

0. 307? 

0.0361 

3 

1 

-0.37  -0 

.59 

0.6561 

0.307? 

C. C36  l 

1 

1 

1 

-0.15 

0.6561 

0.2260 

0.C196 

2 

3 

-1.56 

O.COCO 

0.4050 

0.0700 

3 

4 

-1.91 

O.COOO 

O.COCC 

0.25CC 

1 

1 

1 

-0.32 

0.5314 

0.2566 

0.0310 

2 

3 

-1.61 

O.COOO 

0.3645 

0.0000 

4 

-1.91 

O.COOO 

0. oocc 

C.25CC 

1 

1 

1 

-0.47 

0.4305 

9.2735 

0.0434 

2 

3 

-1.66 

O.COCO 

0. 328C 

C.  1042 

3 

4 

-1.91 

O.COOO 

O.OCCC 

0.25C0 

1 

4 

3 

-1.58 

O.COOO 

0.08  10 

0.0165 

2 

3 

-1.64 

O.COOO 

0.4050 

0.12CC 

3 

4 

-1.91 

0. CCCO 

O.OOCC 

c.  r 5cc 

TABLE  12.02

<.1111 


2 


<■ 


2 


1 


4 


2 


1 


1 


1 


\ 
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MACHINE  MAINTENANCE  £ REPAIR 


A 


1 

3 

-1.65 

0. COCO 

-> 

1 

-1.71 

o.ccoo 

3 

A 

-1.91 

o.cooo 

1* 

1 

1 

0.03  -0.A9 

0. RICO 

2 

1 

o 

• 

0 

1 

o 

• 

vO 

0. R 100 

3 

1 

0.03  -0.A9 

o. eico 

1 

1 

1 

0.28 

0.8100 

2 

3 

-1  .A7 

o.cooo 

3 

A 

-1.91 

o.cooo 

1 

1 

1 

0.07 

0.6561 

2 

3 

-1.52 

o.ccoo 

3 

A 

-1.91 

o.cooo 

1 


1 

1 

-0.12 

0 • 5 3 1 A 

2 

3 

-1.57 

O.COOO 

3 

A 

-1.91 

O.COOO 

1 


1 

1 

-0.30 

C. A305 

2 

3 

-1.62 

O.COOO 

3 

A 

-1.91 

O.COOO 

A 

1 

3 

- 1 . A 9 

C.COGO 

2 

3 

-1.56 

O.CCOO 

3 

A 

-1.91 

O.COOO 

A 

1 

3 

-1.56 

o.ccco 

2 

3 

-1.67 

0 . coco 

3 

A 

-1.91 

o.ccoo 

A 

1 

3 

-1.63 

o.cooo 

2 

1 

-1.76 

o.cooo 

3 

A 

-1.91 

o.cooo 

A 

1 

3 

-1.79 

o.ccoo 

2 

1 

-1.80 

o.cooo 

3 

A 

-1.91 

o.ccoo 

PAGE  24

TABLE 

12.03 

0.1385 
0. 36A  5 
0.0000 

0.0A25 
0. 1330 
0.7500 

l 

3 1 

0. 18CC 
0. 1 8CC 
0. 18CC 

0.01C0 

C.C1CC 

O.OICO 

A 

0.C9CC 

0.A5C0 

O.OOOC 

0.0025 

0.0250 

0.2500 

2 

0.1535 

0.A05C 

0.0000 

C.C05C 
0 .OA75 
0.2500 

1 

0.1976 

0.36A5 

O.OOCC 

C.C18A 

0.C677 

0.2500 

1 

0.2256 
0. 328C 
0.0000 

C.C296 
0 . 0 8 6 0 
0.2500 

1 

0.09CC 

0.A5C0 

O.OOCC 

C.C075 
C .0750 
0.7500 

3 

0. 1535 
0.A05C 
O.OOCC 

C.0271 
C. 1A25 
0.7500 

1 

0. 1976 
0.36A5 
O.COOO 

C.9551 

0.2032 

0.7500 

1 

0.2256 
0. 328C 
0.0000 

C.C8P7 
0.2579 
0. 7500 

1 

\ 
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TABLE  12.04 

GH 

1 

1* 

1 0.51 

o.ce 

1.C000 

o.ooco 

o.coco 

4 

2 

1 0.51 

o.ce 

l.COOO 

0.0000 

0.0000 

3 

1 0.51 

0.08 

1. cooo 

o.oocc 

o.cooc 

\ 
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MACHINE  MAINTENANCE  Z REPAIR  PAGE  26  TABLE  11.01 


ITERATION 13    MEM = 33    ESS MEM = 31    TIME = 14.58
0.421 < G < 0.423    17 STEPS    TIME = 16.61
0.421 < H < 0.423     6 STEPS    TIME = 16.84

I 

1 

U V ( G 1 V ( H ) 

l 

1 -0.37 

PRC  0 S 
0.6561 

0.3078 

C.0361 

MEMORY  STATES 
1 1 

2 

3 -1.60 

0.0100 

0.31CC 

C.  1 9 C 0 

3 

4 -1.91 

0. cooo 

0.0000 

1 .0000 

1 

1 

1 -0.70 

0.5314 

0.3951 

C.0734 

1 

2 

3 -1.69 

0. COCO 

0.729C 

C.271C 

3 

4 -1.91 

0. COOO 

0.0000 

1 .0000 

1 

1 

1 -0.96 

0.4305 

0.4513 

c . 1 1 e i 

l 

2 

3 -1.76 

o.cooo 

0.6561 

C . 3439 

3 

4 -1.91 

0. cooo 

O.OOOC 

1 .COCO 

l 

1 

1 -1.17 

C.  34  8 7 

0.4  8 36 

0. 1677 

1 

2 

3 -1.82 

c.cooo 

0.59C5 

C .4095 

3 

4 -1.91 

o.cooo 

0.0000 

1 .cooo 

1 

1 

1 -1.33 

C . 2 p 2 4 

0. 498C 

0.2195 

1 

2 

3 -1.88 

o.cooo 

0.5314 

0.4686 

3 

4 -1.91 

o.cooo 

O.OOOC 

1 .COOO 

1 

4 

3 -1.46 

0.2268 

0.4991 

l 

C.2722 

2 

4 -1.91 

o.cooo 

0 . 4 7 e ? 

0.5217 

3 

4 -1.91 

o.cooo 

O.OOOC 

1 .0000 

1 

4 

3-1.55 

0. 1853 

C.49C3 

l 

C. 3244 

2 

4 -1.91 

O.COOO 

0.43C5 

0.5695 

3 

4 -1.91 

O.COOO 

O.OOCC 

1 .OCCO 

1 

3# 

3 -1 .46  -1 . 5C 

0. 2288 

0.4951 

4 

C.2722 

2 

3 -1.46  -1.9C 

0. 2268 

0.499  1 

C.2722 

3 

3 -1 .46  -1.9C 

0. 2288 

0.4991 

0.2722 

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MACHINE 

MAINTENANCE  C 

R E P A I 8 

PAGE  27 

GH 

1* 

4 

1 

1 -1.33  -1.77 

0.2324 

0.498C 

0.2195 

2 

1 -1.33  -1.77 

0. 2«24 

0.4980 

0.2195 

3 

1 -1.33  -1.77 

C. 2824 

0.498C 

0.2195 

GH 

1 

1* 

1 -1.17  -1.61 

C. 3487 

0.4836 

0.1677 

2 

1 -1.17  -1.61 

0. 3487 

0.4836 

0.1677 

3 

1 -1.17  -1.61 

0. 3487 

0.4836 

0.1677 

GH 

1 

1* 

1 -0.96  -1.40 

0.4305 

0.4513 

0.1183 

2 

1 -0.96  -1.4C 

0.4305 

0.4513 

0.1183 

3 

l -0.96  -1.4C 

0.4  305 

0.4513 

0.1183 

1 

1 

1 -0.81 

0.4305 

0.3857 

C .0864 

2 

3 -1.72 

0.0000 

0.3280 

0.1469 

3 

4 -1.91 

C.  COCO 

O.OOCC 

0.25CC 

GH 

1 

1* 

1 -0.70  -1.13 

0.5314 

0.3951 

0.C734 

2 

1 -0.70  -1.13 

0.5314 

0.3951 

0.0734 

3 

1 -0.70  -1.13 

0.5314 

0.3951 

0.C734 

l 

1 

1 -0.51 

0.5314 

0. 3222 

0.3488 

2 

3 -1.65 

O.COOO 

0.3645 

0. 1 1C5 

3 

4 -1.91 

C.COCO 

O.OOCC 

0.2500 

1 

1 

1 -0.65 

0.4^05 

0.3266 

0.062C 

2 

3 -1.69 

O.COOO 

0.3?80 

0.1244 

3 

4 -1.91 

0. coco 

O.OOCC 

C .2500 

GH 

1 

1* 

1 -0.37  -0.81 

0.6561 

0.3078 

C . 0 3 6 1 

2 

1 -0.37  -0.81 

0.6561 

0.307e 

0.0361 

3 

1 -0.37  -0.81 

0.6561 

0.3078 

0.0361 

1 

1 

1 -0.16 

0.6561 

0.2268 

0.U196 

2 

3 -1.56 

O.COOO 

0.405C 

0.0700 

3 

4 -1.91 

C.COCO 

O.OOCC 

C.25C0 

1 

1 

1 -0.32 

0.5314 

0.2566 

0.0310 

2 

3 -1.61 

O.COOO 

0.3645 

0.^880 

3 

4 -1.91 

O.COOO 

O.OOCC 

0.2500 

TABLE  13.02 
l 1 1 1 1 


2 


A 


2 


1 


A 


2 


1 
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MACHINE 

MAINTENANCE  c 

REPAIR 
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TAPLE 

13. 

03 

1 

1 

1 

2 

1 

1 

1 

1 

o 

• 

■C* 

00 

0. A305 

0.2735 

0.CA3A 

2 

3 

-1.66 

0. CCOO 

0.328C 

0. 10A2 

3 

* 

-1.91 

0. COCO 

o.oocc 

0.25CC 

4 

1 

1 

3 

-1.58 

0.  COOO 

0.0810 

0.0165 

2 

3 

- 1 .6A 

0.C000 

0 . AO  50 

0.1200 

3 

4 

-1.91 

0.  CCOO 

o.oocc 

C. 75CC 

4 

1 

1 

3 

-1.65 

o.cooo 

0.1385 

0 . 0 A 2 5 

2 

1 

-1.71 

o.cooo 

0. 36A5 

0.1830 

3 

A 

-1.91 

o.cooo 

O.OOCC 

C. 75CC 

GH 

1* 

A 

1 

1 

0.03 

- 0 . A 1 

o. eioo 

0.  I HOC 

0.01C0 

2 

1 

0,03 

-0  . A 1 

0. 81  CO 

0. 18CC 

0.0100 

3 

1 

0.03 

-0.  A 1 

o. Pino 

0. IRCC 

C.C1CC 

1 

2 

l 

1 

c r 

CM 

• 

O 

0. 8 100 

9.09CC 

0.0025 

2 

3 

- 1 .A7 

o.cooo 

0. A5CC 

0.G25O 

3 

A 

-1.91 

o.cooo 

O.COOO 

C.25C0 

1 

l 

I 

1 

0.07 

0.6561 

0.1539 

0 . CO  90 

2 

3 

-1.52 

0.  CCOO 

0. A05C 

C. OA  75 

3 

A 

-1.91 

o.cooo 

O.OOCC 

C.2500 

1 

1 

1 

1 

-0.13 

0. 5 3 l A 

0.1 976 

C.018A 

2 

3 

-1.57 

c. coco 

0. 36A5 

C.0677 

3 

A 

-1.91 

o.cooo 

O.COOC 

0.2500 

1 

1 

I 

1 

-0.  10 

0.  A 305 

0.2256 

0.0296 

2 

3 

-1.62 

c.  croc 

0. 32  PC 

0.086C 

3 

4 

-1.91 

o.cooo 

O.OOCC 

0.2500 

A 

3 

l 

3 

- 1 .A9 

0.  coco 

0.09CC 

C .0075 

2 

3 

-1.56 

o.cooo 

0. A5CC 

C . 07  50 

3 

A 

-1.91 

o.cooo 

O.OOCC 

0. 75CO 

A 

1 

1 

3 

-1.56 

o.ccoc 

0.1539 

C.0271 

2 

3 

-1.67 

o.cooo 

0.A05C 

C. 1A25 

3 

A 

-1.91 

o.cooo 

O.OOCC 

C.750C 
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tarle 

13.04 

4 

l 

1 3 

I 

3 - 

1.63 

0. coco 

0. 1976 

0.0551 

2 

l - 

1.76 

0.0000 

0. 3645 

0.2032 

3 

4 - 

1.91 

0. cooo 

0.0000 

0.75C0 

4 

l 

1 

3 - 

1.70 

0. coco 

0.2256 

0 . C p 8 7 

2 

1 - 

1.80 

0.0000 

0 . 3280 

0.2579 

3 

A - 

1.91 

o.cooo 

O.OOCC 

0.7500 

CH 

1* 

4 

1 

1 

0.51 

0 . C 7 

1. ccoo 

O.COCC 

C.CCCC 

2 

1 

0.51 

0 . C 7 

l.CCOO 

O.OOCC 

C.CCCC 

3 

1 

0.51 

0.C7 

1 .cooo 

0.0000 

O.OOCC 
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b.  A Computer  Communication  Problem 

The  problem  to  be  considered  in  this  subsection  concerns  several 
units  sharing  a single  communication  channel.  If  any  two  units  attempt 
to  transmit  messages  simultaneously,  both  will  fail.  As  the  units  have 
no  means  (other  than  the  channel  itself)  of  coordinating  their  efforts, 
the  decision  to  transmit  is  made  on  the  basis  of  imperfect  information. 
A system  of  this  type  has  been  used  to  link  remote  terminals  to  a 
central  computer  at  the  University  of  Hawaii;  because  this  system  is 
called  the  ALOHA  system,  the  problem  has  become  known  as  the  slotted 
ALOHA  problem.  A more  familiar  example  of  this  problem  is  that  faced 
by  a newsman  attempting  to  address  the  President  of  the  United  States 
at  a news  conference;  if  he  asks  a question  while  another  newsman  is 
doing  the  same,  neither  will  be  recognized. 

The  slotted  ALOHA  problem  has  been  considered  by  Kleinrock  and 
Lam  [1975],  Lam  and  Kleinrock  [1975],  and  others  cited  in  the  first 
reference.  Although  the  problem  has  been  extensively  studied  under 
the  assumption  that  the  number  of  units  seeking  to  transmit  is  known 
(to all units), no work known to this author considers the "dual control"
aspect of the problem (characterized by the fact that clashes are useful
in identifying the number of units seeking to transmit).

The formulation to be considered here limits the number of units, but
recognizes the "dual control" aspect of the problem. Moreover, previous
work resulted in strategies sufficiently complex to preclude evaluation,
even by simulation. In the present analysis, the system under an adapted
feasible strategy is a Markov chain having state set S x M; exact
evaluation of the controller performance is therefore possible.

In the model to be considered here, there are four units, each of which
may be in idle or retransmit mode. During each time interval, a message
originates at an idle unit with probability .1. The unit always attempts
to broadcast a newly-originated message. The three outputs are:

     { No transmissions attempted,
       One successful transmission,
       Multiple transmissions attempted }.

A unit that has unsuccessfully attempted to transmit subsequently enters
retransmit mode. It then selects an input

     U = { Retransmit with probability .2,
           Retransmit with probability .9 }.

Since the system, as viewed by a unit in retransmission mode, is
symmetric, all units select the same input on the basis of the same
input-output history. There results an FPS formulation having 3 states
(corresponding to the number of units in retransmit mode), 2 inputs,
and 3 outputs. The FPS is reachable and detectable. The performance
measure is throughput, i.e. the average number of messages
transmitted per unit time.
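
The collision mechanics described above can be illustrated with a small exact computation. The sketch below is not the finite-memory controller of this report: it assumes that every unit in retransmit mode uses one fixed retransmission probability, takes the global view in which the chain state is the number of units in retransmit mode, and finds the long-run throughput by power iteration. The names build_chain and throughput, and the rule that a unit returns to idle mode after a success, are assumptions made for illustration.

```python
from math import comb


def binomial_pmf(n, q, j):
    """Probability of exactly j successes in n independent trials."""
    return comb(n, j) * q**j * (1 - q) ** (n - j)


def build_chain(n_units, arrival, retransmit):
    """Return (P, success) for the chain whose state k is the number of
    units in retransmit mode; success[k] is the probability of exactly
    one transmission in a slot that starts in state k."""
    size = n_units + 1
    P = [[0.0] * size for _ in range(size)]
    success = [0.0] * size
    for k in range(size):
        for i in range(n_units - k + 1):   # idle units sending a new message
            for r in range(k + 1):         # units in retransmit mode that send
                prob = (binomial_pmf(n_units - k, arrival, i)
                        * binomial_pmf(k, retransmit, r))
                if i + r == 0:
                    nxt = k                # silent slot: nothing changes
                elif i + r == 1:
                    success[k] += prob
                    nxt = k - r            # a successful retransmitter goes idle
                else:
                    nxt = k + i            # colliding new senders enter retransmit mode
                P[k][nxt] += prob
    return P, success


def throughput(n_units, arrival, retransmit, sweeps=5000):
    """Long-run fraction of slots carrying exactly one transmission."""
    P, success = build_chain(n_units, arrival, retransmit)
    size = len(P)
    dist = [1.0 / size] * size
    for _ in range(sweeps):                # power iteration toward the stationary law
        dist = [sum(dist[i] * P[i][j] for i in range(size)) for j in range(size)]
    return sum(d * s for d, s in zip(dist, success))


if __name__ == "__main__":
    for p in (0.2, 0.9):
        print(p, throughput(4, 0.1, p))
```

Comparing the two printed values shows what each input achieves when used blindly; the controllers studied here instead choose between the two inputs on the basis of recent input-output pairs.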


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The  following  results  were  obtained  in  four  iterations: 


         h               g         essential                    time
     lb     ub       lb     ub       memory    effectiveness   (secs)

    .302   .354     .330   .372         1        ≥ 91.2%          .39
    .309   .331     .332   .336         6        ≥ 93.3%         1.54
    .313   .329     .331   .332        26        ≥ 94.6%         5.46
    .312   .330     .330   .331        98        ≥ 94.3%        23.49

"Effectiveness"  was  computed  by  comparing  the  lower  bound  on  h with 
the  final  upper  bound  on  feasible  performance,  .331. 
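
As a check on the arithmetic, the effectiveness figures can be recomputed from the tabulated lower bounds on h and the final upper bound .331. The short sketch below does so; the function name is an assumption, and because the bounds are quoted to only three decimals the recomputed percentages agree with the reported ones only to within about a tenth of a percentage point.

```python
FINAL_UPPER_BOUND = 0.331                      # final upper bound on feasible performance
H_LOWER_BOUNDS = [0.302, 0.309, 0.313, 0.312]  # lower bounds on h, iterations 1 to 4
REPORTED = [91.2, 93.3, 94.6, 94.3]            # effectiveness column of the table (%)


def effectiveness(h_lower, upper=FINAL_UPPER_BOUND):
    """Guaranteed performance as a percentage of the best feasible performance."""
    return 100.0 * h_lower / upper


if __name__ == "__main__":
    for lb, rep in zip(H_LOWER_BOUNDS, REPORTED):
        print(f"h >= {lb:.3f}: recomputed {effectiveness(lb):.2f}%, reported {rep}%")
```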

These results indicate that memory is not very useful for purposes
of decision-making in this problem, i.e. that the performance
that may be achieved on the basis of the most recent input-output
pair alone (iteration 2) is comparable to that which may be achieved
on the basis of an infinite past history. This might be attributed to
the small number of units involved; it is possible that a similar
computation with a larger number of units might yield entirely different
results.


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CHAPTER  V 


CONCLUSIONS 

The mathematical technique of dynamic programming assigns to each
state a value representing the expected rewards accrued when the system
is initiated in that state. A decision-maker uses these values to compare
immediate rewards with potential benefits if the system is made to
enter a desirable state.

Problems  of  decision-making  under  state  uncertainty  may,  in 
principle,  be  solved  by  dynamic  programming,  if  the  state  of  information 
is  itself  considered  to  be  a state.  It  may,  however,  be  practically 
infeasible  to  assign  a value  to  each  state  of  information,  when  the 
number  of  possible  states  of  information  is  sufficiently  large. 
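
To make the "state of information" concrete, the following sketch performs one Bayes update of the information vector after an input is applied and an output is observed. It assumes the system is described by a joint kernel giving the probability of the next state and the output from the current state and input; the function name, the kernel layout, and the two-state numbers are hypothetical and chosen only for illustration.

```python
def update_information_vector(pi, kernel, u, y):
    """One Bayes step: pi[i] is the prior probability of state i, and
    kernel[u][i][j][y] is P(next state j, output y | state i, input u).
    Returns the a posteriori distribution over the next state."""
    n = len(pi)
    unnormalized = [sum(pi[i] * kernel[u][i][j][y] for i in range(n))
                    for j in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("output has zero probability under this prior and input")
    return [x / total for x in unnormalized]


# Hypothetical two-state example with a single input (index 0).
T = [[0.9, 0.1], [0.0, 1.0]]    # state transition probabilities
O = [[0.8, 0.2], [0.3, 0.7]]    # output probabilities given the next state
kernel = [[[[T[i][j] * O[j][y] for y in range(2)]
            for j in range(2)]
           for i in range(2)]]

if __name__ == "__main__":
    print(update_information_vector([1.0, 0.0], kernel, u=0, y=1))
```

Since every distinct input-output history can yield a distinct vector of this kind, the set of possible states of information grows rapidly with the length of the history, which is the difficulty noted above.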

The  mathematical  technique  of  perceptive  dynamic  programming 
assigns  a value  to  certain  information  that  might  be  acquired  at  a 
cost.  These  values  may  be  used  to  compare  performance  achievable  on 
the  basis  of  existing  knowledge  with  potential  benefits  if  further 
information  is  sought. 

In this report, perceptive dynamic programming has been developed
in the context of control of finite probabilistic systems over an
infinite horizon. The system is assumed to be reachable, so that
performance will not depend on the initial state, and detectable, so
that performance will not depend on the initial state of information.
Specifically, reachability assures that the most desirable state can
be reached from any other state; hence the gain achievable when the
system is initiated in the most desirable state can be replicated when
the system starts in any other state. Detectability assures that the
information vector may be arbitrarily closely approximated on the basis
of a sufficiently long string of most recent input-output pairs; hence,
whatever information was available initially is irrelevant in the steady
state. Reachability and detectability also imply that a performance
arbitrarily close to the supremum feasible performance may be achieved
by a finite-memory controller having a sufficiently large memory set.
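
The role of detectability can be seen numerically in a small sketch: two observers that start from opposite information vectors, but see the same string of outputs, end up with nearly identical estimates. The two-state transition matrix T, output matrix O, and output string below are hypothetical, the input is held fixed, and the function names are assumptions for illustration rather than part of the algorithm of this report.

```python
def filter_step(pi, T, O, y):
    """Predict with the transition matrix T, then correct with the
    likelihood O[j][y] of output y in next state j."""
    n = len(pi)
    unnormalized = [O[j][y] * sum(pi[i] * T[i][j] for i in range(n))
                    for j in range(n)]
    total = sum(unnormalized)
    return [x / total for x in unnormalized]


def gap_after(outputs, T, O, pi_a, pi_b):
    """L1 distance between two information vectors after both have
    processed the same output string."""
    for y in outputs:
        pi_a = filter_step(pi_a, T, O, y)
        pi_b = filter_step(pi_b, T, O, y)
    return sum(abs(a - b) for a, b in zip(pi_a, pi_b))


T = [[0.7, 0.3], [0.4, 0.6]]                   # hypothetical transition matrix
O = [[0.9, 0.1], [0.2, 0.8]]                   # hypothetical output probabilities
OUTPUTS = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0] * 2   # an arbitrary string of 20 outputs

if __name__ == "__main__":
    for length in (0, 5, 10, 20):
        print(length, gap_after(OUTPUTS[:length], T, O, [1.0, 0.0], [0.0, 1.0]))
```

The printed gap shrinks as the string grows, which is why an observer that remembers only the most recent input-output pairs loses little by discarding the distant past.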

Reachability and detectability have many implications in FPS's
that are similar to well-known properties of finite-dimensional linear
systems (FDLS). For example, detectability in an FDLS implies that the
observer state may be arbitrarily closely (in some suitable sense)
approximated on the basis of a sufficiently long string of most recent
input-output pairs. The analogous result for FPS's is given in Section
14. Moreover, any FDLS that is initiated in state zero may be expressed
in a form that is controllable and observable. The assumption that an
FDLS is initiated in state zero is equivalent to the assumption that it
has experienced an infinite past under a stabilizing control. Similarly,
any FPS that has experienced an infinite past under an appropriate
decision strategy may be expressed in a form that is reachable and
detectable.

An algorithm for the solution of FPS control problems was implemented
on a digital computer, and two simple problems were "solved" to
demonstrate the efficacy of the method. It appears that more realistic
(and hence more complex) problems might be solved in the same manner,
but it would then be necessary that the computer implementation be
problem-specific.

Possible extensions of the theory which would be beneficial in
extending its applicability include the following:

1)  The recursive computation of memory sets (described in Section
21d) could be explicitly optimized (e.g. by means of a branch-and-
bound interpretation).

2)  The computational efficiency of pseudo-perceptive dynamic
programming (described in Section 21b) might be compared with that
of perceptive dynamic programming. It is clear that pseudo-
perceptive dynamic programming converges less rapidly than does
perceptive dynamic programming, but the former requires less
memory and less time to complete an iteration.

3)  Perceptive dynamic programming is most effective when the
index of detectability, a, lies near zero. In order for this to
occur, outputs need not yield good, reliable state information;
they simply must preclude the possibility of better information
being acquired from less recent input-output pairs. Thus the
notion of detectability is useful in determining whether a given
problem may be solved numerically. If the problem cannot be
solved, then the notion of detectability might be useful in
suggesting a different observation structure, one that is more
conducive to solution. In particular, the following problem might
be posed: Determine outputs for a given underlying process such
that, when perceptive dynamic programming is performed up to a
maximum allowable memory size, feasible performance is maximized.
An output that happens to equal the optimal input given the state
would, of course, solve this problem.

4)  The  notions  of  reachability  and  detectability  might  be
extended  to  systems  having  a large  state  set  and  a great  deal  of 
structure  (e.g.  routing  in  a network  of  queues).  This  could  lead 
to  effective  rules  for  decision-making  on  the  basis  of  imperfect 
state  information  when  consideration  of  the  exact  state  is 
physically  feasible,  but  precluded  on  grounds  of  complexity. 

5)  Notions  of  cross-reachability  and  cross-detectability  might 
be  defined  in  decentralized  systems,  to  indicate  the  extent  to 
which  various  decision-makers  need  to  coordinate  their  efforts. 

APPENDIX  A


Proof  of  Theorem  19.3 


a.  Preliminaries 

$V$ has been defined, in (12.11), as the vector space of bounded, continuous, real-valued functions on $\Pi_N$. $V$, along with the sup norm $\|\cdot\|$, is a Banach space. It will be shown that the sequence $\{v^m\}$, given by (19.6) or (19.7), is bounded, that it has a subsequence that converges (pointwise) to a convex function $v^*$, that the subsequence is Cauchy, implying $v^* \in V$, and finally that $v^*$ satisfies (19.1). A corollary states that $\{v^m\}$ itself is Cauchy in $V$, i.e. that $v^m$ converges uniformly to $v^*$.

Since it cannot be shown immediately that $v^*$ is continuous, $\{v^m\}$ will be treated as a sequence in $W$, the vector space of Lebesgue measurable functions on $\Pi_N$. If $v \in W$, then $\|v\|$ denotes the ess sup norm of $v$. Naturally $V \subset W$.

By abuse of notation, a constant (such as $Q$ or $g^*$) may denote an element of $V$ that is a constant function over $\Pi_N$. Following (17.3), $v \in W$ may be interpreted as a function on the extended domain introduced there:

$v$ is "convex" (over the extended domain) $\iff$ $v(\pi) + v(\pi') \ge v(\pi + \pi')$, for all $\pi$, $\pi'$, $\pi + \pi'$ in that domain.   (A.1)

$W$ is partially ordered by "$\le$" where:

$v \le v' \iff v(\pi) \le v'(\pi), \quad \forall \pi \in \Pi_N$   (A.2)

It will also be necessary to consider the restriction of $v \in W$ to particular subsets of $\Pi_N$ that include the range of $T(\cdot, z)$ when $P(z)$ is subrectangular. Define:

$b(\ell) = \min\{ T_j(e^i, z) : P_{ij}(z) > 0,\ P(z) \text{ is subrectangular, and } z \in Z^\ell \}$   (A.3)

$\Pi_N(b(\ell)) = \{ \pi \in \Pi_N : \text{either } \pi_i = 0 \text{ or } \pi_i \ge b(\ell),\ \forall i \in S \}$   (A.4)

$\|v\|_{b(\ell)} = \sup_{\pi \in \Pi_N(b(\ell))} \{ v(\pi) \}$   (A.5)

b.  A Transformation  in  W 

(A.6) Definition. $f : W \to W$ is defined by:

$fv(\pi) = \max_{u \in U} \{ \pi q(u) + \beta \sum_{y \in Y} v(\pi P(y|u)) \}$

Interpretation: $f$ is the operator of backward inductive dynamic programming.

Remark: Eq. (19.1) may now be expressed as $v^* = f v^* - g^*$.


Transformation  f has  the  following  properties: 
(A.7) Lemma. $v \le v' \implies fv \le fv'$.

(A.8) Lemma. $f(v + C) = fv + \beta C$, where $C$ is a constant.

(A.9) Proposition. $f$ is continuous in sup norm; in particular,

$\|fv - fv'\| \le \beta \|v - v'\|$.

Proof: $fv \le f(v' + \|v - v'\|) = fv' + \beta \|v - v'\|$, and similarly $fv \ge fv' - \beta \|v - v'\|$. †
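The operator of (A.6) and the properties (A.7) and (A.8) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the three-state model, the names `make_f`, `A`, `B`, `alphas`, and the choice of a piecewise-linear $v$ are all hypothetical and do not come from the thesis. The function $v$ is taken positively homogeneous so that it can be evaluated at the unnormalized vectors $\pi P(y|u)$, and a constant $C$ on $\Pi_N$ is represented by $x \mapsto C \cdot (x\mathbf{1})$.

```python
import numpy as np

def make_f(q, P, beta):
    """Build the operator f of (A.6) for a toy model (illustrative only).

    q[u]    : length-N vector of one-step rewards under input u
    P[u][y] : N x N matrix P(y|u); summing over y gives a row-stochastic matrix
    beta    : discount factor
    """
    def f(v):
        def fv(pi):
            # max over inputs of immediate reward plus discounted continuation
            return max(
                pi @ q[u] + beta * sum(v(pi @ P_yu) for P_yu in P[u])
                for u in range(len(q))
            )
        return fv
    return f

# Hypothetical toy model: 3 states, 2 inputs, 2 outputs.
rng = np.random.default_rng(0)
N, NU, NY = 3, 2, 2
A = rng.dirichlet(np.ones(N), size=(NU, N))   # A[u]: state transition matrix
B = rng.dirichlet(np.ones(NY), size=N)        # B[j, y]: output probabilities
P = [[A[u] * B[:, y] for y in range(NY)] for u in range(NU)]
q = rng.uniform(0.0, 1.0, size=(NU, N))
beta = 0.9
f = make_f(q, P, beta)

# A convex, positively homogeneous v: the maximum of a few linear functions.
alphas = rng.normal(size=(4, N))
v = lambda x: float(np.max(alphas @ x))
C = 2.5
v_plus_C = lambda x: v(x) + C * float(np.sum(x))          # v + C on the simplex
v_bigger = lambda x: v(x) + float(np.abs(alphas[0]) @ x)  # v_bigger >= v

pi = rng.dirichlet(np.ones(N))
print(f(v_plus_C)(pi) - f(v)(pi))   # compare with beta * C, as in (A.8)
print(f(v_bigger)(pi) >= f(v)(pi))  # monotonicity, as in (A.7)
```

The shift by $\beta C$ appears because $\sum_y \pi P(y|u)\mathbf{1} = 1$ for every input $u$, which is exactly the step used in the proof of (A.9).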

(A.10) Proposition. $v \in V \implies fv \in V$; i.e. $f$ preserves continuity in $v$.

(A.11) Proposition. If $v \in W$ is convex, then $fv$ is convex; i.e. $f$ preserves convexity in $v$.

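Proof:

$$\begin{aligned}
fv(\pi) + fv(\pi') &= \max_{u \in U} \{ \pi q(u) + \beta \sum_{y \in Y} v(\pi P(y|u)) \} + \max_{u \in U} \{ \pi' q(u) + \beta \sum_{y \in Y} v(\pi' P(y|u)) \} \\
&\ge \max_{u \in U} \{ (\pi + \pi') q(u) + \beta \sum_{y \in Y} [ v(\pi P(y|u)) + v(\pi' P(y|u)) ] \} \\
&\ge \max_{u \in U} \{ (\pi + \pi') q(u) + \beta \sum_{y \in Y} v((\pi + \pi') P(y|u)) \} \\
&= fv(\pi + \pi'). \quad †
\end{aligned}$$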
Adopting the notation (14.19), multiple applications of $f$ take the form:

$$f^k v(\pi) = \max_{\phi \in U^{(Z^{(k-1)*})}} \Big\{ \sum_{z \in Z^{(k-1)*}} \sigma[z,\phi]\, \beta^{\ell(z)}\, \pi P(z)\, q(\phi(z)) + \beta^k \sum_{z \in Z^k} \sigma[z,\phi]\, v(\pi P(z)) \Big\} \qquad \text{(A.12)}$$

Continuity of $f$, established in (A.9), is made stronger below. This will be necessary in order to establish convergence of $\{v^m\}$ in FPS's that satisfy only a condition of weak detectability.


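(A.13) Proposition.

$$\|f^k v - f^k v'\| \le (1 - \alpha^{k;U})\, \beta^k \|v - v'\|_{b(k)} + \alpha^{k;U} \beta^k \|v - v'\|$$

Proof: For any $\varepsilon > 0$, there is a $\pi \in \Pi_N$ such that

$$\|f^k v - f^k v'\| \le f^k v(\pi) - f^k v'(\pi) + \varepsilon .$$

Let $\phi \in U^{(Z^{(k-1)*})}$ be the policy maximizing (A.12), where $\pi$ is as described above. Now:

$$\begin{aligned}
\|f^k v - f^k v'\| - \varepsilon
&\le f^k v(\pi) - \Big( \sum_{z \in Z^{(k-1)*}} \sigma[z,\phi]\, \beta^{\ell(z)}\, \pi P(z)\, q(\phi(z)) + \beta^k \sum_{z \in Z^k} \sigma[z,\phi]\, v'(\pi P(z)) \Big) \\
&= \beta^k \sum_{z \in Z^k} \sigma[z,\phi]\, [ v(\pi P(z)) - v'(\pi P(z)) ] \\
&= \beta^k \sum_{z \in Z^k} \sigma[z,\phi]\, (\pi P(z)\mathbf{1})\, [ v(T(\pi,z)) - v'(T(\pi,z)) ] \\
&\le \beta^k \sum_{z \in Z^k} \sigma[z,\phi]\, (\pi P(z)\mathbf{1}) \begin{cases} \|v - v'\|_{b(k)}, & \text{if } T(\pi,z) \in \Pi_N(b(k)) \\ \|v - v'\|, & \text{otherwise} \end{cases} \\
&\le \beta^k \sum_{z \in Z^k} \sigma[z,\phi]\, (\pi P(z)\mathbf{1}) \begin{cases} \|v - v'\|_{b(k)}, & \text{if } P(z) \text{ is subrectangular} \\ \|v - v'\|, & \text{otherwise} \end{cases} \\
&\le \beta^k \sum_{z \in Z^k} \sigma[z,\phi]\, (\pi P(z)\mathbf{1}) \begin{cases} \|v - v'\|_{b(k)}, & \text{if } \alpha[z] < 1 \\ \|v - v'\|, & \text{otherwise} \end{cases} \\
&\le \beta^k \sum_{z \in Z^k} \sigma[z,\phi]\, (\pi P(z)\mathbf{1})\, [ (1 - \alpha[z]) \|v - v'\|_{b(k)} + \alpha[z] \|v - v'\| ] \\
&\le \beta^k (1 - \alpha^{k;U}) \|v - v'\|_{b(k)} + \beta^k \alpha^{k;U} \|v - v'\| .
\end{aligned}$$

Taking the limit $\varepsilon \to 0$ completes the proof. †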

c.  A sequence  in  V 

(A.14) Definition. $\{v_0^m\}$ and $\{v^m\}$ are sequences in $W$ defined by

$v_0^{m+1} = f v_0^m$,

$v^{m+1} = \tfrac{1}{2} v^m + \tfrac{1}{2} f v^m$.

Clearly (A.14) is consistent with (19.5). By (A.10), $v_0^m$ and $v^m$ lie in $V$, and by (A.11) they are convex. Boundedness of $\{v^m\}$ is now established.

(A.15) Lemma. $v_0^m + \beta^m L(\beta,k) Q_{\min} \le v_0^{m+k} \le v_0^m + \beta^m L(\beta,k) Q_{\max}$.

Proof: By (A.14), $L(\beta,k) Q_{\min} \le v_0^k \le L(\beta,k) Q_{\max}$. (A.7) and (A.8) complete the proof.

(A.16) Lemma. $\|v_0^m\|_D \le \Omega$.


Proof:  (By  induction).  The  result  is  trivial  for  ra=0,  and  follows 

trivially  from  (A.  15)  for  me<0,£p+T>  . 

The  induction  follows  a plan  given  in  the  heuristic  justification 
of  (19.3).  Let  j be  a state  that  maximizes  vm(e^)  and  let 


4>*eU 


(Z<4-1>*) 


be  a policy  that  maximizes  (A.  12)  when  TT«e^  and  v«v 


Now,  for  any  > and  any  me<£,®> 


®/  Jx  HI,  V 

vo(e  > - vo(lr) 


1 vJ(eJ)  - (l  0l£.,4'*]Bi(-)TrP(z)q(^(z))) 


~ ^ J 0[£,4>*]v“~£(TTP(^))^ 


zeZ 


s © 
< L(8,Jl)Q  + 6^  I j 0[z,<J>*]  [v“”2,(eJP(£))  - v”  *■  (ttP(z))] 
zeZ 


<_  L(6,fc)Q  +6*ir.  E j a[z,<J>*](eJP(z)l) 
ztZ 


[v®"4  (T(ejfz))  - v”-*  (T  (n  , z) ) ] + 81  (l-^)ll  vm"*'||  D 


< L(M)Q 


TT.U  j o[z,(|>*](eJP(z)l)a[z]|  + 
' leZ 


D 


1 L(6,Jl)Q  + B^l  - TTj  (l-a)  ] ||  v“  A||  D 


But 


, for  any  TTell^  * there  is  an  input  word  ueU  such  that 


y T.  p (y|u)  > l-P 

LieSL  o(G)  irZI-  ' 
yeY  — 


Thus 


m+£(G) 

V0 


00  < L(B,Jt(u))Qinax  + B£(-v“(ej) 


and 


vmf«.(u)(ii)  > L(e,«,(fi>)Qmln  + 


e1® 


v“OfP(zlH)) 
> ue.iuG))^  + e£(-  v”otz  p(^|fi)) 

yeY 

> LCB.JKu))^  + e£(u)  [v“(eJ)  - L(6,I)Q  - $J[l-(l-p)(l-a)] 

ii  -rTii  J • 


Using  (A. 15), 


mf£  & —t(u)  o /*% 

11*0  ”11  0 < L<B,t  -««»<)  ♦ 6 P ll*0 


l +i 


< L(B,l  +T)Q  + 6 p tl-d-P)  (1-a)  1 II  v0 


m-H  i 


m | £ + £ 

and  ||vB||  <B->  ||  v P ||  „ < « 


(A. 17)  Proposition.  ||  vm||  me<0,®> 


Proof : By  (A. 14), 

w>(;)Mk’S 

So  (A.  16)  implies  ||  v“||  D - 0.  (12.12)  completes  the  proof.  + 
d.  Construction of a Convergent Subsequence

(A.18) Lemma. There is a subsequence $\{v^{m(k)}\}$ of $\{v^m\}$ having the following properties:

(a) $\{v^{m(k)}\}$ converges pointwise to a convex function $w^* \in W$.

(b) $\lim_{k \to \infty} \|v^{m(k)} - w^*\|_{b(\ell)} = 0$, $\forall \ell \in \langle T, \infty \rangle$.

Proof: Theorem 10.9 of Rockafellar [1970] states that any bounded sequence of convex functions on a relatively open set has a subsequence that converges uniformly on closed subsets of its domain. $\{v^m\}$ is bounded, by (A.17). Consider the restriction of $\{v^m\}$ to $\Pi_N^H = \{ \pi \in \Pi_N : \pi_i > 0 \text{ iff } i \in H \}$, for some $H \subset S$. One of the following must hold: $\Pi_N^H$ is empty; $\Pi_N^H$ contains exactly one point; or $\Pi_N^H$ is relatively open (in $R^N$). In each case, there exists a subsequence of $\{v^m\}$ that converges pointwise on $\Pi_N^H$ and uniformly on closed subsets of $\Pi_N^H$. For any $\ell$, $\Pi_N(b(\ell)) \cap \Pi_N^H$ is closed. Taking subsequences of $\{v^m\}$ recursively for each $H \subset S$, the desired subsequence is obtained. †

(A.19) Proposition. There is a subsequence of $\{v^m\}$ that converges in $(V, \|\cdot\|)$, i.e. uniformly on $\Pi_N$.

Proof:  Define: 


w®*^  ■ 1/2  w1  + 1/2  fwm 


0 

w • w*  . 
Let

for 


and 


Thus 


{®(k)>ke<0  „>  be  the  sequence  of  indices  derived  in  (A. 18).  Then, 
any  e>o,  there  is  aK'  such  that: 


e»e<  = niI » 


< e/8 


a K"  such  that: 


b(m(K' ) ) 


1 e/8, 


Vke<K",«» 


By  (A. 9),  if  m>m,  then 


w™ ||  < ||0fi+,n(k,) 


, for  k,k'  _>  K = max(K'  ,K")  , 


||  ^m(k)+m(k')  _ -®(k)  y 

< ||  0m(K)+m(k')  _ am(K)  y 


i K<o,.«)>C®)«'!)'  5 m‘T]  II  «"(K)-«0II  b(„<K>> 
♦ [£«<0..(K)>  (T)w^  S ”*T  ]ll  «”(K,-II 


^m(K) 


aO  II 

-w  II 


+ Z 


b(m(K)>  I me<0,m(K)>  \ 


/m(K)  \ 


(1/2) 


m 


2n 


m 


< e/8  + e/8  - e/4 




and 

||  w“(k)_$m(k')  || 

< ||  ^mCk^rnC^-H^k')  ||  + ||  ^m(k)+m(k' )_^m(k' ) || 

<_  e/4  + e/4  - e/2  . 

But  now: 

||  .2m(k)_.2m(k’)|| 

< ||  0m(k)+m(k)_,jm(k)  ||  + ||dm(k)_am(k,)|| 

+ |J  ^n>(k')_^m(k,)+,n(k,)|| 

<_  e/4  + e/2  + e/4 
- e . 

Consequently  {$2m(k)j  is  a Cauchy  sequence  in  (V,  ||  • ||  ) . t 

(A. 20)  Proposition.  If  v*eV  is  a limit  point  of  {vm}  then  v*  satisfies 
(19.1). 

Proof: Define:

$w^{m+1} = \tfrac{1}{2} w^m + \tfrac{1}{2} f w^m, \quad m \in \langle 0, \infty \rangle$

$w^0 = v^*$

Then, by (A.9), $v^*$ is a limit point of $\{w^m\}$. It will now be demonstrated that w = v* .

Define:

a) $t^m(\pi) = w^{m+1}(\pi) - w^m(\pi)$

b) $t^m = \max_{\pi \in \Pi_N} \{ t^m(\pi) \}$

c) $R^m = \{ \pi \in \Pi_N : t^m(\pi) = t^m \}$


Since $v^*$ is a limit point of $\{w^m\}$, it follows that $t^0$ is a limit point of $\{t^m\}$ and $t^0(\pi)$ is a limit point of $\{t^m(\pi)\}$, $\forall \pi \in \Pi_N$.

Now $t^m(\pi) = w^{m+1}(\pi) - w^m(\pi) = \tfrac{1}{2}[fw^m(\pi) - fw^{m-1}(\pi)] + \tfrac{1}{2}[w^m(\pi) - w^{m-1}(\pi)] \le \tfrac{1}{2}[fw^m(\pi) - fw^{m-1}(\pi)] + \tfrac{1}{2} t^{m-1}$. Thus, by (A.9), $t^m \le t^{m-1}$. Since $t^0$ is a limit point of $\{t^m\}$, $t^m = t^0$.

By the Weierstrass maximum theorem, $R^m$ is nonempty. But $t^m(\pi) = \tfrac{1}{2}[fw^m(\pi) - fw^{m-1}(\pi)] + \tfrac{1}{2}[w^m(\pi) - w^{m-1}(\pi)] \le \tfrac{1}{2} t^0 + \tfrac{1}{2} t^{m-1}(\pi)$, by (A.9). Thus $R^m \subset R^{m-1}$. Since $t^m = t^0$, there is a $\pi \in \Pi_N$ such that $t^m(\pi) = t^0$, $\forall m \in \langle 0, \infty \rangle$. Suppose now that there exists a $\pi' \in \Pi_N$ such that $t^0(\pi') \ne t^0$ and define

$\varepsilon = t^0 - t^0(\pi') > 0$

Then $w^m(\pi) = m t^0 + v^*(\pi)$ and $w^m(\pi') \le (t^0 - \varepsilon) + (m-1) t^0 + v^*(\pi')$. Hence

$$|w^m(\pi) - v^*(\pi)| + |v^*(\pi') - w^m(\pi')| \ge [w^m(\pi) - v^*(\pi)] + [v^*(\pi') - w^m(\pi')] \ge m t^0 - t^0 + \varepsilon - (m-1) t^0 = \varepsilon .$$

But, for some $m \in \langle 1, \infty \rangle$, $\|w^m - v^*\| < \varepsilon/2$, since $v^*$ is a limit point of $\{w^m\}$. This is a contradiction; hence $t^0(\pi) = t^0$, $\forall \pi \in \Pi_N$. Now $w^1 = w^0 + t^0$. Identify $g^* = 2t^0$ to see that $v^*$ satisfies (19.3). †
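The role of the averaged recursion $w^{m+1} = \tfrac{1}{2} w^m + \tfrac{1}{2} f w^m$ and of the identification $g^* = 2t^0$ can be seen on a very small example. The sketch below is an illustration under assumptions that are not in the thesis: it uses a fully observed, single-input, two-state periodic chain (so $f$ reduces to $fv = q + v \circ \text{next}$), and the names `q`, `nxt`, `iterate` are hypothetical. Without averaging the increments oscillate; with averaging they settle at half the long-run average reward in every state.

```python
import numpy as np

q = np.array([1.0, 0.0])   # reward earned in each state (assumed toy data)
nxt = np.array([1, 0])     # deterministic 2-cycle: state 0 -> 1 -> 0

def f(v):
    """Undiscounted backward-induction step for the toy chain (single input)."""
    return q + v[nxt]

def iterate(step, v0, n):
    """Apply `step` n times and return the increments v^{m+1} - v^m."""
    v, increments = np.array(v0, dtype=float), []
    for _ in range(n):
        v_next = step(v)
        increments.append(v_next - v)
        v = v_next
    return increments

plain = iterate(f, [0.0, 0.0], 6)
averaged = iterate(lambda v: 0.5 * v + 0.5 * f(v), [0.0, 0.0], 6)

gain = 0.5                   # long-run average reward of the 2-cycle
print(plain[-2], plain[-1])  # increments of the plain recursion keep alternating
print(averaged[-1])          # increments of the averaged recursion, per state
```

In this toy case the averaged increments equal `gain / 2` in both states, which matches the step "Identify $g^* = 2t^0$" at the end of the proof.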

e.  Summary  and  Proof  of  (19.3) 

By (A.19), $\{v^m\}$ has a limit point $v^*$ in $V$. By (A.20), $v^*$ satisfies (19.3).

By (A.9), $\|v^{m+1} - v^*\| \le \|v^m - v^*\|$, and hence $\{v^m\}$ converges in $(V, \|\cdot\|)$, i.e. uniformly on $\Pi_N$, to $v^*$. Thus $v^*$ is continuous. Since each $v^m$ is convex, it follows that $v^*$ is convex. Boundedness of $v^m$ is a consequence of (A.17).

APPENDIX  B

Proof  of  Theorem  21.6 


a.  Proof  of  Part  (a) 

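First consider the discounted case, $\beta < 1$.

Define $Y^m$ to be a strategy which selects inputs optimally on the basis of a finite number of delayed state perceptions, taking the form:

$$y^m(k) = \begin{cases} [\, s(k - \ell(\underline{z}(k))),\ y(k) \,], & \text{if } \underline{z}(k) \in \mathrm{ess}[M] \text{ and } k \in \langle 0, m-1 \rangle \\ [\, y(k) \,], & \text{otherwise} \end{cases} \qquad \text{(B.1)}$$

Then the inputs prescribed by $Y^m$ at times $k \in \langle m-1, \infty \rangle$ take the form $\phi^*[\eta^m(k)]$ where $\phi^*$ is the optimal feasible policy corresponding to the solution of (19.1), and

$$\eta^m(k) = \begin{cases} T(\pi(0), \underline{z}(k)), & \text{if } k \in \langle 0, m-1 \rangle \text{ and } \underline{z}(k) \notin \mathrm{ess}[M] \\ T(e^{s(k - \ell(\underline{z}(k)))}, \underline{z}(k)), & \text{if } k \in \langle 0, m-1 \rangle \text{ and } \underline{z}(k) \in \mathrm{ess}[M] \\ T(\eta^m(k-1), u(k-1), y(k)), & \text{otherwise} \end{cases} \qquad \text{(B.2)}$$

Note that $\{\eta^m(k)\}$ is the information vector process which results when the observation process is $\{y^m(k)\}$.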
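The recursion in the last case of (B.2) is the usual information vector update. The sketch below assumes $T(\pi, z) = \pi P(z) / (\pi P(z)\mathbf{1})$, which is consistent with the identity $v(\pi P(z)) = (\pi P(z)\mathbf{1})\, v(T(\pi,z))$ used in the proof of (A.13); the function names and the two-state matrices `P_a`, `P_b` are hypothetical illustration data, not taken from the thesis.

```python
import numpy as np

def update_information_vector(pi, P_z):
    """One step of T(pi, z): propagate pi through P(z) and renormalize."""
    unnormalized = pi @ P_z
    total = unnormalized.sum()          # probability of observing the pair z
    if total <= 0.0:
        raise ValueError("input-output pair has zero probability under pi")
    return unnormalized / total

def information_vector_after_word(pi0, matrices):
    """Apply the update along an input-output word, one pair at a time."""
    pi = np.asarray(pi0, dtype=float)
    for P_z in matrices:
        pi = update_information_vector(pi, P_z)
    return pi

# Assumed toy model: one input, two outputs; P_a + P_b is row-stochastic.
P_a = np.array([[0.45, 0.05], [0.10, 0.20]])
P_b = np.array([[0.30, 0.20], [0.40, 0.30]])
pi0 = np.array([0.5, 0.5])

step_by_step = information_vector_after_word(pi0, [P_a, P_b])
all_at_once = update_information_vector(pi0, P_a @ P_b)
print(update_information_vector(pi0, P_a))
print(step_by_step, all_at_once)
```

Updating pair by pair and updating once with the product matrix give the same vector, since the normalizing constants cancel; this is why a word $\underline{z}$ can be handled through $P(\underline{z})$ directly.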

Also  define  strategy  ym  , which  selects  inputs  (u(k)},  . , ^ 

ke<u 

according  to  y10"^  and  inputs  (u(k)  }ke<m,°°>  according  to  ym  . 

Then 


g(B,ym)  < g(8,yni)  (B.3) 

since  ym  maximizes  g(B,*)  over  the  set  of  strategies  realizable  on 
the  basis  of  observations  (B.2).  Thus 

g(8,ym+1)  - g(B,yin) 

1 g^.y"*1)  - g(B,ym) 

■ (1-6)IEyW.i(!:k.o  6‘r<k»  - E-»(Wkr<k))1 

- <1-8)8"^  1%,  ek-Vk»  - 8k-”r(k» 

Y Y 

- (l-6)em[E  {v*(nm+1(m) ) } - E (v*(nm(m))}] 

Y Y 

- ( 1—3) 6™  e m+1  {v*(nmfl(m))  -v*(nm(m))} 

Y 

< (1-6) B111  E ^ {A[nm+1(m)  ,0(m)  ] } ||  v*||  A (B.4) 


If  m^O  or  (with  probability  one)  £(m-l)tfess[M] , then 


g(B.Ym)  - g(B,Y®+1). 


Otherwise 
“ nm(m)  > if  z(m)tfess[M] 


nri-1,  . s(m-£(z(m)))  ... 

r)  (m)  - T(e  — ,z(m))  and 


nm(m)  = T(l(e' 


s(m-l-£(z(m-l))) 


, z(m-l-£(z^ra-l)) ; 


if  z(m)eess[M] 


so  ,nm(m)  ] < 


ct[jz(m)],  if  z^m)eess[M] 

0,  otherwise 


£ [M]U 

< a , by  (14.23). 


Substitution  into  (B.4)  yields 

..  £ . [M]tI 

g(8,Y  ) - g(6,Ym)  1 (1-8)8™  a min  ||v*| 


£ [M] 

Now  g[M]  - g(8,Y  ]»  and  g*  = g(B,Y°)  = g(8,Y  ™ ° )• 


v*llAl4fi  by  (12.16)  and  (19.3).  Thus 


g[M]  - g*  1 ^,£  [M]  S^.Y^1)  - g(B,Ym) 

min 


£ , I M]  £ . [M]tJ 
a min  — min  4ft 

6 a 


m-£(z/mj))j,z(m)j/ 
(B . 5) 


(B.6) 

Moreover 


(B.  7) 


Take the limit $\beta \to 1$ to prove (21.6)(a) in the undiscounted case. †
b.  A Bound on Perceptive Values

The following intermediary result will be required: $|v^M[i,z] - v^M[i',z']| \le \Omega$, $\forall [i,z], [i',z'] \in X[M]$. Intuitively, this must be true in the limit as $\ell_{\min}[M] \to \infty$, for then $v^M[i,z] \to v^*[T(i,z)]$ and by (19.3)(c), $|v^*[\pi] - v^*[\pi']| \le \Omega$.

In order to bound $v^M[i,z]$, attention will be focussed on $\bar{v}^M[\pi,z]$, which is defined by (21.3). The pair $[\pi,z]$ may be regarded as a generalized perceptive state, signifying that input-output word $z$ has evolved since the information vector was known to equal $\pi$. Naturally

$v^M[i,z] = \bar{v}^M[e^i,z]$   (B.8)

The following additional properties of $\bar{v}^M$ are readily established.


(B.9) Lemma. $\bar{v}^M[\pi,z]$ is convex in $\pi$, for any $z \in M$.

(B.10) Lemma. $\bar{v}^M[\pi,z] \le \max_{j \in S} \{ \bar{v}^M[e^j,e] \}$.

(B.11) Lemma. $\bar{v}^M[\pi,z] \ge \min_{\pi' \in \Pi_N} \{ \bar{v}^M[\pi',e] \}$.

Proof: The relative value of being in the generalized perceptive state $[\pi,z]$ can only decrease if certain information is withdrawn. An observer in generalized perceptive state $[\pi,z]$ at time $k$ perceives information of the form

$$\begin{cases} [\, s(k' - \ell(\underline{z}(k'))),\ y(k') \,], & \text{if } k' - \ell(\underline{z}(k')) > k - \ell(z) \text{ and } \underline{z}(k') \in \mathrm{ess}[M] \\ [\, y(k') \,], & \text{otherwise} \end{cases} \qquad k' \in \langle k, \infty \rangle$$

whereas an observer in generalized perceptive state $[T[\pi,z],e]$ at time $k$ perceives information of the form

$$\begin{cases} [\, s(k' - \ell(\underline{z}(k'))),\ y(k') \,], & \text{if } k' - \ell(\underline{z}(k')) > k \text{ and } \underline{z}(k') \in \mathrm{ess}[M] \\ [\, y(k') \,], & \text{otherwise} \end{cases} \qquad k' \in \langle k, \infty \rangle$$

Since, in the former case, more information (specifically, perception of states $s(k')$, $k' \in \langle k+1-\ell(z), k \rangle$) is available, it follows that

$$\bar{v}^M[\pi,z] \ge \bar{v}^M[T(\pi,z),e] \ge \min_{\pi' \in \Pi_N} \{ \bar{v}^M[\pi',e] \} . \quad †$$

N 


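(B.12) Lemma. $\|\bar{v}^M[\cdot,z]\|_D \le \|\bar{v}^M[\cdot,e]\|_D$, $\forall z \in M$.

Proof: By (12.11)(d),

$$\|\bar{v}^M[\cdot,z]\|_D = \max_{\pi \in \Pi_N} \{ \bar{v}^M[\pi,z] \} - \min_{\pi \in \Pi_N} \{ \bar{v}^M[\pi,z] \}$$

But (B.10) and (B.11) imply

$$\min_{\pi \in \Pi_N} \{ \bar{v}^M[\pi,e] \} \le \min_{\pi \in \Pi_N} \{ \bar{v}^M[\pi,z] \} \le \max_{\pi \in \Pi_N} \{ \bar{v}^M[\pi,z] \} \le \max_{\pi \in \Pi_N} \{ \bar{v}^M[\pi,e] \} . \quad †$$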
(B.13) Proposition. $v^M[i,z] - v^M[i',z'] \le \Omega$, $\forall [i,z], [i',z'] \in X[M]$.

Proof: It suffices to show that $\|\bar{v}^M[\cdot,e]\|_D \le \Omega$. Define $j$ to be the state which maximizes $\bar{v}^M[e^j,e]$, and let $\psi^*$ denote an optimal perceptive strategy adapted to $M$, constructed according to (21.1) for $\pi(0) = e^j$; i.e. $\psi^*$ selects inputs optimally on the basis of information $s(0) = j$ and $\{x^M(k)\}$. Then, by (21.2),

v(ej,e]  - ^^k£<0,X-1>  6kq<k) 

!7M[ej,zM(I)],  if  zM(I)^ess[M]) 

( 

vM[xM(i)],  if  zM(£)eess[M]  ) 


|s(0)-j}  - L(6,£)g[M] 


and,  for  any  tteIU  i 


v[TT,e]  _>  £leS  tt±  Ej,*  {Eke<o,J.-i>  6 q(k) 

( vM  [Tt,zM(I)J,  if  zM(T)*ess[M) 

♦ e1  - 

( vM[xM(£) ] , if  zM(£)eess[M] 


|s(0)-i}  - L(6,£)g[M) 
Thus


— M . j , — M 

v [eJ,e]  - v"  [Tr,e] 

1 L(B,I)Q 


vM  [eJ,zM(£)]  - vM[tt,zM(£)]  , 


+ 6 »jV  {< 


M .v. 


if  z_  (Z)te88  [M]  y | s(0)=j } 


0, 


othe rwise 


+ $ (1-tt  ) max 


^U.zJeXlM]  {vM[1‘^}  " min[i,z]ex[M]{vM[i^} 


1 L(6,£)q 


+ f'i  V { 


M - 


al*  (£)],  if  z”a)tfess[M] 

0 otherwise 


II  vM  [*,2M(£)]||d  |S(o)-j} 

+ BI(l-7rj)||7M[.,e]||D 

< L(6,I)Q  + B1  nja||  vM[.,e]||  D + BI(l-irj)||  vM[.,e]||  p 

< L(e,I)Q  + eV^  (1-1)  ] ||  v M [ • ,ej  ||  D . 
But, for any $\pi \in \Pi_N$, there is an input word $u \in U$ such that


E1ES£  l(u)  V®*”-  ^ 

%ei  — 


Thus , for  any  1TenN  ’ 

vM[n,e]  > L(e,Ufi))Qmln 


+ e*(u)  ^leS  V 1 


V M [TT,zMa<u))  , if  zri(Jl(u))*ess[M] 


vM[xM(f.(u))],  otherwise 


i(0)“i,  u(0) . . . u(H(u))*u}  L(6,<Uu))g[M]  - L(6,£(u))g[M] 


> L(6,S.(u) )Q 


P((u,i)) ,e]} | s (0) } 


where  (B.ll)  was  used  to  obtain  the  second  inequality.  Thus: 


A-,e]||  D < »aV<oitp>  {L(0,k)Q  + 0k[L(6,)l)Q 
+ 0I[l-(l-p)(l-a)]  ||vM[*,e]||D 

< **\e<0tl  > {L(3,k+I)Q  + 0k+V-(l-p>(l-I)]||  vMf*,e]||D 
which implies $\|\bar{v}^M[\cdot,e]\|_D \le \Omega$. †


c.  A Bound  on  Pseudo-perceptive  Deterioration 

Let $v^M[i,\tilde{i},z]$ denote the value of being in augmented state $[i,z]$ while believing the augmented state to be $[\tilde{i},z]$, where $i, \tilde{i} \in S$. Specifically,
vM[i,i,z]  - q”(i,u*) 

+ 0 zyEY  Zjes  Pz 

- g[M],  £Eess[M]nz+(e^,e1)  (B.14) 


M 

where  u*  maximizes  (21.1)  in  the  evaluation  of  v [i,z].  Eqs  (21.1) 
and  (B.14)  may  also  be  written: 


M ~ 4 

V [i.z]  - T(e\z) 


q(u*)  + Eyey  P(y|u*)l 


r p“  (i , j . (u* ,y) ) 


"jeS 


L T(ei,z)P(y|u*)l  J 


evM[j,TM(z,(u*,y))] 


- g[M] 


vM[i,i,z]  - T(e1,*) 


q(u*)  + EyeYp(y|“*>i 


(B. 15) 
P.(i,j,(u*y)) 


T(e1,z)P(y|a*)l 


3vM[j,TM(z,  (u*,y)) ] - g[M] 


Since  6[T(ei,z),T(e1,z)]  <_  atz.],  application  of  (13. A),  (2.13) 


and  (B. 11)  to  (B.15)  yields 


vM[i,z]  < ot[z][Q+Ffi] 


+ Ke1  ,z)  q(u*)  + EyeyP(y|u*)l 


p"(i,j,(u*,y)) 


T(ei,z)P(y|u*)l 


6vM[j,TM(z,(u*,y))]  - g[M] 


Combining  (B.16)  and  (B.17), 


M a,  M * , 

V [ i * z_]  - V [i,i,zj 


1 a[z][Q+en]  + T(ei,z)Ey£YP(y|u*)l 


PM(i,j,(u*,y))  P”(i,j,(u*,y)) 

z z 


A 

T(e1,z)P(y|u*)l  T(e\z)P(y  |u*)l 


BvM[£,T^ (£,  (u*,y)) ] 
Define:

M m 

F(z)  “ maxj  ~ v [i»i».5.]} 

Naturally 

vM[j,z]  “ vM[j,£] 

1 vM[j,£]  - vM[j,j,z] 

< F(z) 

Substituting  (B.19)  and  (B.20)  Into  (B.18), 

F(z)  £ maxlcC  max^aU]  [Q+Bfi] 

+ EyeY(T(e1,z)P(y|u)l) 

a[z(u,y)  - TM(z,(u,y))]6F(TM(z,(u,y))) 

If  M-Z  and  1-1 , then: 

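$$\begin{aligned}
F(z) &\le \alpha[z][Q + \beta\Omega] + \alpha[z]\,\beta\alpha\,[Q + \beta\Omega] + \alpha[z]\,\beta\alpha\,\beta\alpha\,[Q + \beta\Omega] + \cdots \\
&= \frac{\alpha[z]\,[Q + \beta\Omega]}{1 - \beta\alpha}
\end{aligned}$$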


(B.19) 


(B.20) 


(B. 21) 


(B. 22) 
In the more general case, multiple-step versions of (B.18) and (B.20) are constructed, following (14.19) and (A.13), to obtain:

F(z)  < max  ®ax  {£  ,.  , \ *(o[z'  .^(e1  ,z)P(£' ) 1) 

lee  (z^-1)*)  z'eZ(k-1)* 

B^-^alz  z’  - ^(z.z^latz'HQ+Bni  + E .(o[z'  ,*)T(e1,z)P(i,)l) 

z/eZK 

6kct(zz'  - Az,z')]F(TM(z,z'))}  (B.23) 

Finally,  note  that: 

vM[l,z]  - vM[l,i,z] 

_<  vM[i,z]  - vM [ 1 , 1 , zj 

+ vM[i,i,z]  - vM[i,zJ 

m /v  M a 

+ v [l.z]  - v”[l,i,z] 

< 2F(z)  (B. 24) 

d.  Proof  of  Part  (b) 

The proof of part (b) is constructed in exactly the same manner as that of part (a), except that the incremental deterioration in performance due to pseudo-perception, given by (B.23), is used in place of the incremental value of perception.

Consider first the discounted case. Define $Y^m$ to be a strategy which selects inputs at times $\langle 0, m-1 \rangle$ according to $\phi^M$ and the remaining inputs according to $\psi^M$. Then


to  ^min^M\  .M. 

g(B,Y  ) ” g(6.^  ) 


(B. 25) 


M. 


g(B,Y  ) - g(6,<t>  ) 


(B. 26) 


Following  (B.4),  and  using  (B.23),  (14.23),  and  the  convention 
£(b;a)“£  if  a<b. 


g(6,Ym)  - g^.Y®*1) 


< 


< 


< 


(1-6)  B^: 


m+1 


(l-B)6mE 


m+1 


(1-6) 6“e 


m+1 


(vM[zM(m)]  - vM[s“(z(m)),  xM(m)] 
(2F(zM(m))} 

{E*_0  6kot[zM(m)£(m;nrt-k)-zM(m+k)] 


a[zM(m+k)]}  2[Q+6fi] 

< <1-6)6"e  (r;.0Ska[j(«-tmln[M1:  ^k-t>ax[M])] 

Y 

a[z(m+k-JL  . [M] ; m+k) ] } 2[Q+Bfi] 

— min 

i (1-6)6^  ! <£0  e‘»[ic-(11,nl»l; 

_ L-IHltT 

m+k-fc  [M])]}ct  mln  2[Q+en] 

max 


J 
< (1-B)8  {L(B,«.maxrM]-«-mln[M]) 


+ 6 “a*  min  L(8,«.)(l-a)  *} 


*minlM]U 


2 iQfgflJ 


Summing as in (B.7) completes the proof in the discounted case.

Take the limit $\beta \to 1$ to prove (21.6)(b) in the undiscounted case.

APPENDIX  C

Listing  of  the  Computer  Program 


PL/I OPTIMIZING COMPILER          /* DECLARATIONS */

    /************************/
    /*                      */
    /*   MODEL PARAMETERS   */
    /*                      */
    /************************/

    DCL 1 MODEL EXTERNAL,
          2 N FIXED BIN,                      /* NUMBER OF STATES              */
          2 NU FIXED BIN,                     /* NUMBER OF INPUTS              */
          2 NY FIXED BIN,                     /* NUMBER OF OUTPUTS             */
          2 NZ FIXED BIN,                     /* NUMBER OF SYMBOLS IN Z        */
          2 (M,ESS_M) FIXED BIN,              /* MEMORY STAGES COUNTER         */
          2 ERR FLOAT BIN,                    /* ERROR, USUALLY G.HIGH-H.LOW   */
          2 (MAX_M,MAX_ESS_M) FIXED BIN,
          2 MIN_ERR FLOAT BIN,                /* USER-SPECIFIED BOUNDS         */
          2 FMT FIXED BIN,                    /* OUTPUT FORMAT                 */
          2 (P_PROBS,P_RWDS,P_ZCODE) POINTER, /* POINTERS TO STRUCTS, BELOW    */
          2 P_ROOT POINTER,                   /* ROOT OF MEMORY TREE           */
          2 P_ESS_NODE_1 POINTER,             /* START OF ESS NODE CHAIN       */
          2 G,
            3 (HIGH,LOW) FLOAT BIN,           /* BOUNDS ON G                   */
            3 STEPS FIXED BIN,                /* DYNAMIC PROG STEPS COUNTER    */
          2 H LIKE MODEL.G,
          2 P_NODE POINTER,                   /* PRESENT NODE                  */
          2 P_REL POINTER,                    /* RELATIVE NODE, ARG TO SCAN    */
          2 P_REC POINTER,                    /* RECURRENT NODE                */
          2 (LEV,MAX_LEV,LO,LOO) FIXED BIN,   /* LENGTH OF BRANCH OF P_NODE    */
          2 (U,Y,Z) FIXED BIN;                /* INPUT/OUTPUT/IO PAIR          */

    DCL 1 STRUCT_ZCODE BASED(P_ZCODE),        /* TRANSLATES (U,Y) TO Z         */
          2 (NU1,NY1) FIXED BIN,
          2 ZCODE(NU REFER(NU1),NY REFER(NY1)) FIXED BIN;

    DCL 1 STRUCT_PROBS BASED(P_PROBS),        /* ORIGINAL TRANS PROBS          */
          2 (NZ2,N2) FIXED BIN,
          2 PROBS(NZ REFER(NZ2),N REFER(N2),N REFER(N2)) FLOAT BIN;

    DCL 1 STRUCT_RWDS BASED(P_RWDS),          /* ORIGINAL INC REWARDS ARRAY    */
          2 (NU3,N3) FIXED BIN,
          2 RWDS(NU REFER(NU3),N REFER(N3)) FLOAT BIN;
PL/I  OPTIMIZING  COMPILER  /*  DECLARATIONS  */


STMT  LEV  NT 


6 


0 


7 


0 




/•  •/ 

/•  MEMORY  TREE  SPEC  I PIC ATIO R •/ 

/•  V 


DCL  1 MODE  BASED (P_NCDE) , 

2 P_ESS_MODE~POIMT£R#  /•  POI MTS  TC  ESS  MODE,  BPLOB  •/ 

2 (P_TPF,P_BRANCHES)  POINTER,/*  POIM~  TO  SUbSTRUCTS  OP  RODE  •/ 

2 P.BACK  POINTER,  /•  IDENTIFIED  PREVIOUS  MODE  •/ 

2 Z^BACK  PIIED  BIN#  /•  IDS  BRANCH  CM  PREflOOS  MODE  •/ 

2 ( N 0 , Hi  0}  Ptlfefi  Rx»# 

2 ROMStJH  (N  REPER(MO))  PLOAT  BIN# 

/•  ROMSUR(I)  * SHB/J  TPM(t.J)  •/ 
2 TPH (N  REFER (MO)  , N REFER (MO)  ) PLOAT  BIN, 

/•  TRANS  PROBABILITY  MATRIX  •/ 

2 BP  ANCNLS  IB2  RE? PR  1*20)  ) , 

3 P_BRAMCH  POINTER,  /•  IDENTIFIES  NOOE  ALONG  BRANCH 

Z PROH  CURRENT  NODE  •/ 

3 E_FPANCH  BIT  ALIGN  ED ; /•  IS  BRANCH  1 A NODE  IN  Z*?  •/ 


DCL  1 ESS  NODE  BASEL  (P  ESS_NODE) , 

2 p’nEIT  ESS  NCDi  POINTER,  /•  NEXT  NODE  IN  ESS  NODE  CHAIR  •/ 
2 (NOO,NnOO,NZOO)  FIXED  BIN, 


2 (P  f G , P V H , P N,P^nG,P_PZ,P  02)  POINTER, 

/•  POINT  TO  SUBSTROCTS,  ■ ELON  • / 

2 REC,  /•  FLAGS  NHICH  ID  RFC  FEN  STS  •/ 

3 (TO, PROF, G, H)  BIT  ALIGNED, 

2 UH  FIXED  BIN,  /•  INPUT  - STEP  H •/ 

2 P NEXT2(NZ  REFER(NZOO))  POINTER, 

/*  NEXT  (FSS  ) NODE,  IF  NEXT 

I/O  PAIR  TS  THE  SUBSCRIPT  2 •/ 


2 EG (N  REFER  (N00) ) FLOAT  BIN,/*  RELATIff  VALUE  - STEP  G •/ 

2 f H ( N REFER (N00) ) FLOAT  BIN,/*  RELATIVE  VALUE  - STEP  H */ 

2 V(N  REFER  (NOO)I  FLOAT  BIN,  /•  VORESPACE  FOR  LFS  Of  OYN  PR  •/ 

2 UG(N  REFER (MOO))  FIXED  BIN,/*  OPTIMAL  INPUT  - STEP  G •/ 

2 P2 (N2  R1 PEN ( NZOO)  , N REFER (N00)  , N REFER  (N00) ) FLOAT  BIN, 

/•  TM  OF  AUGMENTED  SYSTEM  •/ 
2 02  (NO  REFER  (NUOO)  ,N  MPIR(ROO))  FLOAT  BIN; 

/*  INCREMENTAL  REMANDS  E*  R 

AUGMENTED  SYSTEM  */ 


PL/I  OPTIMIZING  COMPILER  /*  DECLARATIONS  */


STHT  LEV  NT 


OCLIO^O 

/• 

•/ 

DO  L 1 010 

/•  PAST  REFERENCE  OP  NODAL 

PA  P A HPT  ER  S */ 

DCL1020 

/• 

•/ 

DCL1 030 

DCL10UO 

D-'LiOSO 

8 

1 

0 

DCL 

(PP_?PR,*r_BPANCHES#PP_?G, PP  VH,PP_H,PP_ 

UG,PP_PZ, PP_ . Z,  ?P_PLAii| 

DC  L 1 05  0 

POINTER;  /•  PnI NT 

To  STPnCTUHES,  "pLON 

•/ 

DCL1O70 

DTL1080 
DCL 1r90 

9 

1 

0 

DCL 

B_TPB  (10000)  MS  ED  ( FP^TPH)  P11IT  Bin; 

DCL1 100 
DCL  1 1 1 o 

10 

1 

0 

DCL 

1 P BRANCHES (100001  B»SED(PP  BRANCHES) 

, 

DCL  1^20 

2 P P BRANCH  PClNTEP, 

Drl 1 110 

2 P_E_BRANCH  BIT  ALIGNED; 

DCL 1 140 
DCL1 ISO 

1 1 

1 

0 

DCL 

E_PG ( 10000)  BASFD  (FP_PG)  FLOAT  BIN; 

DCL 1 150 
DC  L 1 1 1 0 

12 

1 

0 

DCL 

P_VH(10000)  aASPD(PP.VH)  PLCAT  BIN; 

DCL 1 1*0 
DCL 1 190 

13 

1 

0 

DCL 

P_N(100P0)  BASED  (FP_H)  »LMT  BIN; 

DCL 1200 
DC  L 1 2 1 0 

14 

1 

0 

DCL 

*_UG<10000)  BACED(PP.OG)  *I&  ED  BIN; 

DCL 1220 

DC  L 1 2 1 0 

15 

1 

0 

DCL 

B_PZ ( 1000C)  BASED (FP_PZ)  BLOAT  BIN; 

DCL 1240 
DC I 12^0 

1b 

1 

0 

DCL 

E_OZ('0000)  BASED (EP_QZ)  FLOAT  BIN; 

DCL 1 250 
DCL 1 27 0 

17 

1 

0 

DCL 

FLAG ) 10000)  BASED (PP.PLAG)  FINED  PtN; 

/• 

GENERALLY  OVER  D9(*| 

•/ 

DCL 1280 
DCL 1290 

ie 

1 

0 

DCL 

D P_S KIP (10000)  BASED(PP_N)  PTXED  BIN; 

/• 

HASTINGS  SNIP,  oVFd 

«•/ 

DCL1 300 

19  1 0 

20  1 0 

21  10 




/•  */ 

/•  RISC  DECLARATICNS  •/ 

/•  •/ 




DC L (NOIL  t LI  MERC)  BUILTIN ; 

DCL  TIRING  FNTRUPTIED  PIN(J1,0)»; 

DC L 1 TIRE  EITERRAI,  /•  TIRES  IN  «*C/100 

2 tPREP.G.H  LIRIT)  PTEED  BIN(31,0); 


DCL  1 123 
DC  LI  330 
DCL  1 mo 
DC  LI  350 
DCL 1 360 
0CL1 3^0 
ftCll  mO 

«cLim 

DC  L 1 4 0 0 

•/  DCL  1410 
DC  11 42  0 


PL/I  OPTIMIZING  COM  PI  l PR  FPS.npT:  PPOC  OPTION5  (MAIN)  REORDER; 

SOURCE  LISTING 

S~BT  LEV  NT 

1 0 FPS  OPT:  TPrC  OPTIONS  (FAIN)  PEOPDER;  EAINOOIO 

■ATN0020 

1 0 SINCIUDF  rD1  (CCL)  ; BAIN0030 

4 1 0 DCL  (PPPP  G, SOLVE  G,TPEP  H, SOLVE  H , REPORT)  El T ENTRY;  BAINOO40 

E*INOOSO 

5 1 0 DCL  r?  FIXED  PIN;  /•  ITER*TION  N UEbFR  •/BAINOOfi" 

6 1 0 DCL  (TA'r.PAGE,Ir_PAGE)  FIXED  BIN;  /•  PAGP  COUNTERS  •/EATN0070 

7 1 0 DCL  TITLE  CHAR(32),  (I  ,J)  PlXFD  BIN,  B BTT,  *»  POINTER,  S *LD AT  BTN;EAIN0OH0 

H 1 0 DCL  LAP  CHAR(P2)  INIT((***  ||  (60) * - • ||  ));  E A TN0090 

0 1 0 DCL  T P CHAR  (6)  INTT(*TIBE  *');  HHN0100 

HAIN0110 


10 

1 

0 

CN  PNU?AGF (SYSPFINT)  BFGIN; 

FAIN0120 

1 1 

2 

0 

PUT  EDIT  (•  | | ' I ' * ' 1 * ) (CO  L ( 1 ) , A,C0l(86)  , A , PAGE, A, COL ( H6)  ,A)  ; 

n* INOI 30 

12 

i 

0 

PUT  EDIT  (TITLE)  (SKIP (6) , Cr L ( 14)  , A)  ; 

EATN71UA 

13 

2 

0 

TOT. PAGE  = TA  T.P  AGE ♦ 1 ; 

NAIN01S0 

14 

2 

0 

l9  TOT. PAGP  > 1 

FATN0160 

THEN  PUT  POTT  ('  PAGF*  ,TO’r_PAGP)  ( * (6 ) , A , P ( 3)|  ; 

T AIN 0170 

IS 

? 

0 

IF  IT>0 

4ATN01PT 

THEN  CO; 

EAT  N0190 

16 

2 

1 

I1”. PAGF  * lT.PAC.Fil  ; 

E A I NO  20  0 

1 7 

2 

1 

Pll~  PDIT (• TABLE* , IT*100  ♦ IT. PAGE)  (X  (6)  , A ,P  (6,  2,-2)  ) ; 

E ATN0210 

14 

2 

1 

IP  i?.pagr»i 

EA  T NO  220 

THEN  DC; 

E AIEO:^ 

14 

2 

2 

PUT  EDI'*'  (BAR  , *|  ITERATION*  , IT,  'NEB  **,B, 

“A  TNO:uT 

•FSS  EFE  *' , ESS.E, Tr,~IEF.PR?P,* | • , * | • ,* ( ' , * ( * , 

" AI  NO’S'l 

G. I ON , 1 < G <» ,G. HIGH, G. STEPS,  • ST  EPS*, TP, TINE. G,*  | ' , 

EIIN0260 

* )*,H.LON,*  < H <•  ,H.  HIGH,  H. STEPS,'  STt‘*>S*  ,TE,TIHP.  H, 

E A T NO  270 

* | * ,B AR) 

E A INO/nO 

(SKIP  (2)  ,2  (COL  (14)  ,A)  ,F(3)  ,X  (U)  ,A,P(3)  ,X  (3)  , A,F(1)  , 

E AINOtoO 

X (1) , A, P (6, 2,-2) , f (3) , A, COL (14) , A, CCL(7S) , A, 

E A I NO  70  7 

2 (CCL  (14)  , A,  P (b, 1)  ,A,F(8,1),*(Q)  , A , X ( 7 ) , A , F (6 , 2 , - 2)  , 

EAINO310 

X ( 3)  , A)  , COL  (14)  , A)  ; 

F AINO  32° 

EA I NO  3 70 

20 

2 

2 

ip  put* i 

EATNO  140 

THEN  PUT  E D I ? ( ' PC  ! fl  V (G)  V (H)  PPABS*) 

EA IN03SO 

(SKI P (2)  , COL  (14)  , A ) ; 

fl  ATN03'-0 

21 

2 

2 

ELSE.  PU*  FDIT(*PC  II*)  (CAL(14),A)J 

EATN0370 

22 

2 

£ 

PUT  EDIT ('NEE OPY  STATES' ) (CCL  (63), A)  J 

BAIN03«0 

EATE0140 

2 J 

« 

2 

FK  P ; 

EAI NOuOO 

24 

? 

1 

LO  « 0; 

EA  TNOU  10 

2S 

4! 

1 

END; 

BA  I NOu 20 

26 

2 

0 

ELSE  mi  LDIT(»nrOEI.Fn  srrcs*)  (COL(6)),A); 

E AIN043D 

L 7 

2 

0 

PND; 

BAIN04UO 
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PL/l  CPTINIZING  COUP  HER 


PPS_ OPT:  PROC  OPTIONS (HATH)  REORDER; 


ST  NT  LEV  NT 


28 

1 

0 

TITLE*** ; 

29 

1 

0 

BAI_LEV,  NAX_N,NAX  PSS_N , IT, PNT, 

30 

1 

0 

B , ESS_B  * l7 

31 

1 

0 

BIN_FRR*0.; 

32 

1 

0 

TIHE. LIBIT  * 3; 

33 

1 

0 

uET  LIST  (TITLE,  K, KU, Nt , NZ, PNt, 

34 

1 

0 

TIBE.LIBI1  * TIB  P. LI  BIT* 1 00 ; 

35 

1 

0 

SIGNAL  FMCPAGE (SYSPRINT) ; 

36 

1 

o 

PUT  PCIT(N,*  STATES  * , NU , * INPUTS 



/»»»»»**»»*«*•••«»«»•«•»*«**«•••«•*»•***/ 

/•  •/ 

/•  REA 0 HODPL  AMD  PPINT  TITI.C  PAGE  •/ 

/•  •/ 


0; 


.LrNI?#HIM_EPF,  PAX_B,  - A X_BS*-_N) 

,*  OUTPUTS' ,MZ, * I/O  PAIRS* , 
•TINE  UNIT:  *, TINE. LIBIT,  *NTN  ERR:  *,NIN^F®P, 

•BAX  NPB:*,NAX  H,»BAX  ESS  NEB:  * , BA  f _*BS_B) 

(SKIP(2)  ,C^L  (19)  ,4(F(4)  ,A)  ,SNIP(2)  , Cr  L ( 2 2 ) , A , P (6 , 2 .-?) 
COL  (0  3)  ,A,P(5,3)  , SK I P ( 2)  , COL  ( 2 2)  , A , P ( 4)  ,COL  ( 0 1 ) , » , F ( 4 ) ) 

0 ALLOCATE  ST RUCT^ZC^Dt , ST  *UcT_PRO DS,  STP *ICT_P MDS  , M"DE # ESS_NODrf ; 

0 2CODE  - 0; 

0 P_ROCT,P_FSS_NODE_1  * P_N0DE; 

0 p2bACK,p“kEIT^E?S^NOCF  * MULL; 

0 p'nextZ  * P_ prr~ ; 

0 p”TPN,FP_TrB  « ADDR  (TPB  (1 ,1) ) ; 

0 p“pPANCHES,  FP_E°AHCHPS  * ADD” ( PP A NC«*S ( 1 ) ) ; 

0 p”?G,FP.Vr,  * ADDS  (VG  (1)  ) ; 

0 P“VH#FP2VH  - ADDR(VH(1)); 

0 p“*  • ADDP(M<1) ) ; 

0 P~UG,  PP.UG  * A00R(UG(1))J 

0 P”PZ  » ADDR (PZ(1,1,1)>  ; 

o p'gz  * ADDR  (OZ (1 , 1) ) : 

0 PEC • G , PPC . H * *1*B; 

0 DO  1*1  TO  N*N; 

1 P TPB  (I)  *0; 

1 END ; 

0 CO  1*1  TO  N; 

1 F f G ( I)  ,F_VH (I)  » 0. ; 

1 P~UG(  I)  - 1; 

1 P_TPd<  (1-1) •*  ♦ I),  PC 8 SUB ( I ) ■ 1.J 

1 EM  0 ; 




/****************************************/
/*  PLACE INPUT PROBS IN PZ             */
/****************************************/

PUT EDIT('TRANSITION PROBABILITIES:','Z','(U,Y)','P')
        (SKIP(3),COL(14),A,SKIP,COL(15),A,X(7),A,X(9),A);
DO Z=1 TO NZ;
   IF LINENO(SYSPRINT) + 3+(N/10+1)*N > 55
   THEN SIGNAL ENDPAGE(SYSPRINT);
   PUT EDIT(Z) (F(16)) SKIP(2);
GET_UY_PAIR:
   GET LIST(U);
   IF U=0 THEN GOTO GET_TPM;
   GET LIST(Y);
   PUT EDIT(U,Y) (COLUMN(22),2 F(3));
   ZCODE(U,Y) = Z;
   GO TO GET_UY_PAIR;
GET_TPM:
   IF LINENO(SYSPRINT) + (N/10+1)*N > 55
   THEN SIGNAL ENDPAGE(SYSPRINT);
   B = '0'B;
   FP_PZ = ADDR(P_PZ->F_PZ((Z-1)*N*N + 1));
   DO I=1 TO N*N;
      F_PZ(I) = 0.;
   END;
   GET LIST((F_PZ(I) DO I=1 TO N*N));
   DO I=1 TO N;
      PUT SKIP;
      PUT EDIT((F_PZ(J) DO J=(I-1)*N+1 TO I*N)) (COL(16),5 F(8,4));
   END;
   DO I = 1 TO N*N;
      B = B | F_PZ(I)¬=0.;
   END;
   F_E_BRANCH(Z) = B;
   F_P_BRANCH(Z) = NULL;
END;
/* COPY PZ INTO PROBS */
FP_PZ = P_PZ;
P = ADDR(PROBS(1,1,1));
DO I=1 TO N*N*NZ;
   P->F_PZ(I) = F_PZ(I);
END;


/* VERIFY SUM/Y,J/ P/I,J/(Y|U) = 1. */

B = '0'B;
DO I=1 TO N;
   DO U=1 TO NU;
      S = 0.;
      DO Y = 1 TO NY;
         Z = ZCODE(U,Y);
         IF Z¬=0
         THEN DO J = (Z-1)*N*N+(I-1)*N+1 TO (Z-1)*N*N+I*N;
            S = S + F_PZ(J);
         END;
      END;
      IF ABS(S-1.) > 1E-4 THEN DO;
         PUT EDIT('ERROR: TRANS. PROBS. DO NOT SUM TO ONE FOR I =',I,
                  ', U =',U) (SKIP(2),A,F(3),A,F(3));
         B = '1'B;
      END;
   END;
END;
IF B THEN STOP;
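The consistency check above (for every state I and input U, the probabilities over all outputs Y and next states J must sum to one) can be sketched in Python as follows. This is only an illustration: the nested-list layout, the 0-based indices, the use of `None` for an unused (U,Y) pair, and the name `check_row_sums` are assumptions, not part of the listing.

```python
def check_row_sums(pz, zcode, n_states, n_inputs, n_outputs, tol=1e-4):
    """Return the (i, u) pairs whose outgoing probabilities do not sum to one.

    pz[z][i][j] : probability of moving from state i to state j together
                  with the input/output pair coded z (hypothetical layout)
    zcode[u][y] : code z of the pair (u, y), or None if the pair is unused
    """
    bad = []
    for i in range(n_states):
        for u in range(n_inputs):
            total = 0.0
            for y in range(n_outputs):
                z = zcode[u][y]
                if z is not None:
                    total += sum(pz[z][i])   # sum over next states j
            if abs(total - 1.0) > tol:       # same tolerance as the listing
                bad.append((i, u))
    return bad


# Two states, one input, two outputs.
pz = [[[0.5, 0.0], [0.0, 0.25]],
      [[0.0, 0.5], [0.75, 0.0]]]
zcode = [[0, 1]]
print(check_row_sums(pz, zcode, 2, 1, 2))
```

Like the PL/I loop, the sketch collects every offending pair rather than stopping at the first one, so that all input errors can be reported in one run.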
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Pl/I  CPTiniZIMG  CCNPILEP  FPS_CPT:  PPOC  OPTIONS  (B AIN)  REr  &DER ; 

S7BT  LEV  NT 


1 0H 

1 

0 

/•  PLACE  INPUT  PENAPDS  IN  JZ  AMD  PHDS  •/ 

/•••••••••••••••••••••••••••••••••••••••/ 

It  LINFNP  (SYSPtINT)  ♦.♦(N/10*.)  *N  fl  > SS 

BAINlSn 
PAINlf^O 
B A I N 1 6 1 0 
BATNlf»?B 
BATN1630 

109 

1 

0 

THEN  SIGNAL  F N DP  AG  E (S  Y S P P ! NT ) ; 

PUT  EDIT  (*  INCrPHtNTM  t-  bn  ARDS:  * # • U* , *0') 

B AIN  IfcUO 
BAlNlf.50 

no 

1 

0 

(Sri P C 33  # x c 1 33  , A, SKIP, COL  (2 7) , A , X ( 1 0) , A)  ; 
PUT  SNIP  (2) ; 

BATNU50 

BATHlf-70 

G.HIGH = -1E10;
G.LOW = 1E10;
DO U=1 TO NU;
   FP_QZ = ADDR(P_QZ->F_QZ((U-1)*N+1));
   GET LIST((F_QZ(I) DO I=1 TO N));
   PUT EDIT(U) (COL(25),F(3));
   PUT EDIT((F_QZ(I) DO I=1 TO N)) (COLUMN(35),5 F(8,4));
   DO I=1 TO N;
      G.HIGH = MAX(G.HIGH,F_QZ(I));
      G.LOW = MIN(G.LOW,F_QZ(I));
   END;
END;
FP_QZ = P_QZ;

P = ADDR(RWDS(1,1));
DO I=1 TO N*NU;
   P->F_QZ(I) = F_QZ(I);
END;

12H 

1 

0 

/•  BffC  PFELIBINA3IFS 

Mi R « G. HIGH  - G.LOH; 

BAINIBf-0 
• /S  A T V 1 970 
B AIN  1«°0 

IF MAX_M <= 0 THEN MAX_M=10000;
IF MAX_ESS_M <= 0 THEN MAX_ESS_M=1000;
IF FMT¬=0 & FMT¬=1
THEN DO;
   PUT EDIT('*** INCORRECT OUTPUT FORMAT ',FMT,' SPECIFIED ***')
           (SKIP,X(10),A,F(4),A);
   STOP;
END;

/**************************************************/
/*                                                */
/*  MAIN SECTION OF THE PROGRAM                   */
/*                                                */
/**************************************************/

LOOP:
   IT = IT+1;
   IT_PAGE = 0;
   CALL TIMING(TIME.PREP);

/**************************************************/
/*                                                */
/*  SOLVE FOR OPTIMAL GAIN G                      */
/*  AND OPTIMAL VALUE VG                          */
/*  (USE VH AS INITIAL GUESS)                     */
/*  (LEAVE SOLUTION IN BOTH VG AND VH)            */
/*                                                */
/**************************************************/

   CALL SOLVE_G;
   CALL TIMING(TIME.G);

/**************************************************/
/*                                                */
/*  SOLVE FOR FEASIBLE GAIN H                     */
/*  AND CORRESPONDING VALUE VH                    */
/*  (USING VH AS INITIAL GUESS)                   */
/*                                                */
/**************************************************/

   CALL PREP_H;
   CALL SOLVE_H;
   CALL TIMING(TIME.H);

   CALL REPORT;

   CALL PREP_G;
   GOTO LOOP;
END;

PL/I OPTIMIZING COMPILER    PREP_G: PROC REORDER;

SOURCE LISTING

PREP_G: PROC REORDER;

%INCLUDE CD1(DCL);
/**************************************************/
/*                                                */
/*  ADD NODES AS REQUIRED (FOLLOWING REC.G)       */
/*  COPY VG INTO VH                               */
/*  PRUNE OUT NODES WHICH ARE NO LONGER ESS       */
/*                                                */
/**************************************************/

DCL ADDNODE EXT ENTRY;
DCL (BU(0:NU),BZ(0:NZ),B,BA) BIT ALIGNED, (P,P0,P1) POINTER,
    (I,ZZ,Z_STRING(0:MAX_LEV)) FIXED BIN;

/* ADD NEW NODES */
BA = '0'B;
P_NODE,P1 = P_ESS_NODE_1;

A_LOOP:
IF ¬REC.G THEN GOTO END_A_LOOP;

BU,BZ = '0'B;
FP_UG = P_UG;
DO I=1 TO N;
   BU(F_UG(I)) = '1'B;
END;

DO U=1 TO NU;
   IF BU(U)
   THEN DO Y=1 TO NY;
      BZ(ZCODE(U,Y)) = '1'B;
   END;
END;


DO ZZ = 1 TO NZ;
   IF BZ(ZZ)
   THEN DO;
      P,P0 = P_NODE;
      LEV = -1;
A_LOOP1:
      LEV = LEV+1;
      Z_STRING(LEV) = P->Z_BACK;
      P = P->P_BACK;
      IF P¬=NULL
      THEN GOTO A_LOOP1;

      Z_STRING(LEV) = ZZ;
      P_NODE = P_ROOT;
A_LOOP2:
      FP_BRANCHES = P_BRANCHES;
      Z = Z_STRING(LEV);
      IF ¬F_E_BRANCH(Z)
      THEN GOTO OUT_A;
      P = F_P_BRANCH(Z);
      IF P=NULL
      THEN DO;
         IF ¬REC.G THEN GOTO OUT_A;
         FP_UG = P_UG;
         U = UH;
         DO I=1 TO N;
            IF F_UG(I)¬=0 & F_UG(I)¬=U
            THEN DO;
               CALL ADDNODE;
               BA = '1'B;
               GOTO OUT_A;
            END;
         END;
         GOTO OUT_A;
      END;
      P_NODE = P;
      LEV = LEV-1;
      IF LEV>=0
      THEN GOTO A_LOOP2;
OUT_A:
      P_NODE = P0;
   END;
END;
END_A_LOOP:
P_NODE = P_NEXT_ESS_NODE;
IF P_NODE¬=NULL
THEN GOTO A_LOOP;
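The two inner loops above first collect the Z_BACK codes from a node back to the root, then replace the top entry by the new pair ZZ and follow branches down from the root for as long as they exist. A minimal Python sketch of that walk is given below. The class `Node`, the dictionary `branch` and the helper names are hypothetical, and the sketch leaves out the E_BRANCH test and the call to ADDNODE.

```python
class Node:
    """One memory state: a node in the tree of recent I/O-pair strings."""

    def __init__(self, z_back=None, parent=None):
        self.z_back = z_back    # pair code on the branch from the parent
        self.parent = parent    # None at the root
        self.branch = {}        # pair code -> child node


def add_child(parent, z):
    child = Node(z, parent)
    parent.branch[z] = child
    return child


def z_string(node):
    """Pair codes from the node back to the root, deepest branch first."""
    codes = []
    while node.parent is not None:
        codes.append(node.z_back)
        node = node.parent
    return codes


def descend(root, codes):
    """Follow codes from the root, taking the last list entry first.

    Returns the deepest existing node and the number of codes consumed.
    """
    node, used = root, 0
    for z in reversed(codes):
        nxt = node.branch.get(z)
        if nxt is None:
            break
        node, used = nxt, used + 1
    return node, used


def successor(root, node, zz):
    """Longest stored string that matches `node` followed by the new pair zz."""
    return descend(root, z_string(node) + [zz])


root = Node()
a = add_child(root, 1)   # most recent pair was 1
b = add_child(a, 2)      # most recent pair 1, the one before it 2
print(successor(root, b, 1) == (a, 1))
```

Where `descend` stops short of the full string, the PL/I code decides whether the missing branch has to be created by ADDNODE.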
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PL/I  OPTINIIING  COHPILER 


/* CHECK: WAS ANYTHING ADDED? */
IF ¬BA
THEN DO;
   SIGNAL ENDPAGE(SYSPRINT);
   MIN_ERR = 1.E10;
   PUT EDIT('*** NO MEMORY STATES ADDED - AT MOST ONE MORE ITERATION'
            ||' WILL BE ALLOWED ***') (SKIP(2),X(10),A);
   RETURN;
END;

/* CLEAR OUT ESS NODE CHAIN */
P_NODE = P_ESS_NODE_1;
PRE_LOOP:
P_REL = P_NODE;
P_NODE = P_NEXT_ESS_NODE;
IF P_NODE¬=P1 THEN GOTO PRE_LOOP;
GOTO ENTER_LOOP;

PRUNE_LOOP:
P_REL = P_NODE;
P_NODE = P_NEXT_ESS_NODE;
IF P_NODE=NULL THEN RETURN;

ENTER_LOOP:
FP_BRANCHES = P_BRANCHES;
DO Z=1 TO NZ;
   IF F_E_BRANCH(Z) & F_P_BRANCH(Z)=NULL
   THEN DO;
      FP_UG = P_UG; FP_VG = P_VG; FP_VH = P_VH;
      DO I=1 TO N;
         IF F_UG(I)¬=0 THEN F_VH(I) = F_VG(I);
      END;
      GOTO PRUNE_LOOP;
   END;
END;

ESS_M = ESS_M-1;
P_REL -> P_ESS_NODE -> P_NEXT_ESS_NODE = P_NEXT_ESS_NODE;
FREE ESS_NODE;
P_ESS_NODE = NULL;
P_NODE = P_REL;
GOTO PRUNE_LOOP;
END;

PL/I OPTIMIZING COMPILER    ADDNODE: PROC REORDER;


SOURCE LISTING

ADDNODE: PROC REORDER;

%INCLUDE CD1(DCL);

/**************************************************/
/*                                                */
/*  ADD BRANCH Z TO NODE P_NODE                   */
/*  ALSO ADD OTHER NODES AS REQUIRED TO MAINTAIN  */
/*  RECURSIVE PROPERTIES OF THE MEMORY SET        */
/*                                                */
/**************************************************/

DCL SCAN EXT ENTRY;

DCL Z_ADD FIXED BIN INIT(Z), P0 POINTER INIT(P_NODE);
                            /* REGISTERS TO SAVE INITIAL
                               VALUES OF Z AND P_NODE */

DCL R(N) FLOAT BIN;         /* ROW SUM OF NEW TPM */
DCL Z_STRING(0:MAX_LEV) FIXED BIN;
DCL P_NEW(0:MAX_LEV) POINTER;

DCL (S,SV,E) FLOAT BIN, (I,II,J,K,UU) FIXED BIN, (B,BB) BIT ALIGNED,
    (P,P1,P2,FP_PROBS,FP_RWDS,FP_TPM2) POINTER;
FP_RWDS = ADDR(RWDS(1,1));

                            /* FILL IN Z_STRING WITH
                               DESCRIPTION FOR P_NODE */
Z_STRING(0) = Z_ADD;
DO I=1 TO MAX_LEV;
   P1 = P_BACK;
   IF P1=NULL
   THEN GOTO OUT;
   Z_STRING(I) = Z_BACK;
   P_NODE = P1;
END;
OUT:
MAX_LEV = MAX(MAX_LEV,I);
LEV,L0 = I;


                            /* THIS LOOP ADDS BRANCH Z_ADD
                               TO P0 & UNIQUE PRECEDENTS */
LOOP1:
LEV = LEV-1;

                            /* FIND P_NODE FOR GIVEN Z_STR */
P_NODE = P_ROOT;
DO I = LEV TO 0 BY -1;
   P = P_NODE;
   P_NODE = P_BRANCHES->F_P_BRANCH(Z_STRING(I));
END;

IF P_NODE¬=NULL
THEN GOTO NO_MORE_ADD;

                            /* ALLOCATE NEW NODE */
ALLOC NODE,ESS_NODE;

                            /* LINK TO OLD NODE */
Z_BACK = Z_ADD;
P_BACK = P;
P->P_BRANCHES->F_P_BRANCH(Z_ADD) = P_NODE;

                            /* PLACE NEW NODE AT START OF
                               ESS NODE CHAIN */
P_NEXT_ESS_NODE = P_ESS_NODE_1;
P_NEW(LEV),P_ESS_NODE_1 = P_NODE;

P_TPM = ADDR(TPM(1,1));
FP_BRANCHES,P_BRANCHES = ADDR(BRANCHES(1));
P_VG = ADDR(VG(1));
FP_VH,P_VH = ADDR(VH(1));

37 

1 

0 

P_N  - ADDA  (4(1))  J 

A nPN  O'*  ’ 0 

FP_UG,P_UG = ADDR(UG(1));
P_PZ = ADDR(PZ(1,1,1));
FP_QZ,P_QZ = ADDR(QZ(1,1));
REC.G,REC.H = '0'B;
                            /* UPDATE MEMORY COUNTERS */
M = M + 1;
ESS_M = ESS_M+1;

P_NODE = P;


                            /* COMPUTE TPM, VH AND QZ */
                            /* PRESET UG TO SHOW R(I) > 0 */
FP_VG = P_VG;
DO I = 1 TO N;
   SV,R(I) = 0.;
   FP_TPM = ADDR(P_ESS_NODE_1->P_TPM->F_TPM((I-1)*N+1));
   FP_PZ = ADDR(PROBS(Z_ADD,I,1));
   DO J = 1 TO N;
      S = 0.;
      II = 1;
      FP_TPM2 = ADDR(P_TPM->F_TPM(J));
      DO K = 1 TO N;
         E = F_PZ(K) * FP_TPM2->F_TPM(II);
         II = II + N;
         S = S + E;
         SV = SV + E*F_VG(K);
      END;
      F_TPM(J) = S;
      R(I) = R(I) + S;
   END;
   P_ESS_NODE_1->ROWSUM(I) = R(I);
   IF R(I)>0
   THEN DO;
      F_UG(I) = 1;
      F_VH(I) = SV/R(I);
      UU = 0;
      DO U=1 TO NU;
         S=0.;
         DO J=1 TO N;
            S = S + F_TPM(J) * FP_RWDS->F_QZ(UU+J);
         END;
         F_QZ(UU+I) = S/R(I);
         UU = UU+N;
      END;
   END;
   ELSE F_UG(I) = 0;
END;
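The loop above builds the matrix of the new node as the product of the one-step matrix of the added pair Z_ADD with the matrix stored in the parent node, keeps the row sums R(I), and forms a row-normalized value estimate. A rough Python sketch of that computation follows. It rests on a reading of a badly scanned listing: in particular, weighting the parent value by the intermediate state k is an assumption, and the function name and list layout are hypothetical.

```python
def extend_node(pz_add, tpm_old, vg_old):
    """Prefix one I/O pair to a memory string.

    pz_add[i][k]  : one-step matrix of the added pair (row i -> column k)
    tpm_old[k][j] : matrix stored in the parent node
    vg_old[k]     : value vector of the parent node
    Returns (tpm_new, row_sum, value); value[i] is None where row_sum[i] is 0,
    which corresponds to UG(I) = 0 in the listing.
    """
    n = len(pz_add)
    tpm_new = [[sum(pz_add[i][k] * tpm_old[k][j] for k in range(n))
                for j in range(n)] for i in range(n)]
    row_sum = [sum(row) for row in tpm_new]
    value = []
    for i in range(n):
        if row_sum[i] > 0:
            weighted = sum(pz_add[i][k] * tpm_old[k][j] * vg_old[k]
                           for k in range(n) for j in range(n))
            value.append(weighted / row_sum[i])
        else:
            value.append(None)
    return tpm_new, row_sum, value


tpm_new, row_sum, value = extend_node([[0.5, 0.0], [0.0, 0.0]],
                                      [[1.0, 0.0], [0.0, 1.0]],
                                      [2.0, 4.0])
print(tpm_new, row_sum, value)
```

The listing normalizes the reward entries QZ by the same row sums; that step is omitted here to keep the sketch short.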

                            /* COMPUTE E_BRANCH,P_BRANCH */
DO Z=1 TO NZ;
   FP_PZ = ADDR(PROBS(Z,1,1));
   F_P_BRANCH(Z) = NULL;
   P_ESS_NODE_1->P_ESS_NODE->P_NEXTZ(Z) = NULL;
   DO I=1 TO N;
      IF R(I)>0
      THEN DO J=1 TO N;
         IF F_PZ((J-1)*N+I) > 0
         THEN DO;
            F_E_BRANCH(Z) = '1'B;
            GOTO NEXT_Z;
         END;
      END;
   END;
   F_E_BRANCH(Z) = '0'B;
NEXT_Z:
END;

GOTO LOOP1;


NO_MORE_ADD:
P_REL = P_NODE;
L00 = LEV;
B = '1'B;

DO LEV = L00+1 TO L0-1;
   Z = 1;
   P_NODE = P_NEW(LEV);
   FP_UG = P_UG; FP_TPM = P_TPM;
   CALL GET_R_PZ;
   DO Z=2 TO NZ;
      CALL GET_PZ;
   END;
END;

LEV = L00;
P_NODE = P_REL;
P_REL = P_ROOT;
Y = Z_STRING(LEV+1);
P2 = P_ESS_NODE_1;
B = '0'B;
IF P_ESS_NODE=NULL THEN GOTO NEXT_SCAN;

LOOP2:
IF P_NEXTZ(Y)=NULL THEN GOTO NEXT_SCAN;
P=P_NODE;
DO I = 0 TO LEV;
   Z_STRING(I) = P->Z_BACK;
   P = P->P_BACK;
END;
P1 = P_REL;
Z = Y;
FP_UG = P_UG; FP_TPM = P_TPM;
CALL GET_R_PZ;

NEXT_SCAN:
CALL SCAN;
IF P_NODE¬=NULL
THEN GOTO LOOP2;

FINISHED:
                            /* RESTORE CALLING Z,P_NODE */
P_NODE = P0;
Z = Z_ADD;
RETURN;


GET_R_PZ: PROC;             /* THIS ENTRY COMPUTES R(*) FIRST */
DO I=1 TO N;
   R(I) = ROWSUM(I);
END;

GET_PZ: ENTRY;              /* COMPUTE P_NEXTZ,PZ */
P = P_ROOT->P_BRANCHES->F_P_BRANCH(Z);

IF P=NULL
THEN GOTO COMP2;

IF B
THEN DO;
   P2 = NULL;
   DO I = LEV TO 0 BY -1;
TOP:
      FP_BRANCHES = ADDR(P->P_BRANCHES->F_P_BRANCH(Z_STRING(I)));
      P1 = F_P_BRANCH(1);
      IF ¬F_E_BRANCH(1)
      THEN RETURN;

      IF P1 = NULL
      THEN DO;
         P2 = P;
         P = P_ROOT;
         GOTO TOP;
      END;
      P = P1;
   END;

   IF P2 = NULL
   THEN DO;
      P2 = P1;
      P1 = P_ROOT;
   END;
END;


                            /* COMPUTE PZ WHERE P1,P2 ARE
                               Z(P_NODE)||Z = Z(P1)||Z(P2)
                               P_NEXTZ(Z) = P2 */
IF P_NEXTZ(Z)=P2
THEN RETURN;
P_NEXTZ(Z) = P2;
DO J=1 TO N;
   S = 0.;
   FP_TPM = ADDR(P1->P_TPM->F_TPM(J));
   FP_PZ = ADDR(P_PZ->F_PZ((Z-1)*N*N+J));
   S = P2->ROWSUM(J);
   II = 1;
   DO I=1 TO N;
      IF F_UG(I)¬=0
      THEN F_PZ(II) = F_TPM(II) * S / R(I);
      II = II+N;
   END;
END;
RETURN;

                            /* COMPUTE PZ WHERE
                               P_NEXTZ(Z) = P_ROOT */
COMP2:
BB = '0'B;
IF P_NEXTZ(Z)=P_ROOT
THEN RETURN;
P_NEXTZ(Z)=P_ROOT;
DO I=1 TO N;
   IF F_UG(I)¬=0
   THEN DO;
      FP_TPM = ADDR(P_TPM->F_TPM((I-1)*N+1));
      FP_PZ = ADDR(P_PZ->F_PZ((Z-1)*N*N+(I-1)*N+1));
      DO J=1 TO N;
         FP_PROBS = ADDR(PROBS(Z,1,J));
         S = 0.;
         II=1;
         DO K=1 TO N;
            S = S + F_TPM(K) * FP_PROBS->F_PZ(II);
            II = II+N;
         END;
         F_PZ(J) = S/R(I);
         BB = BB|S>0;
      END;
   END;
END;
IF ¬BB THEN P_NEXTZ(Z)=NULL;
RETURN;
END;
END;
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PL/I  CPTIHIIING  COHPILER 


PR  BP_H  : P80C  REORDER; 


SOURCE  LISTING 


STHT 

LEV 

NT 

1 

0 

PRFP  H:  PPOC  RECPDEP; 

PPPH0010 

"RPH002O 

2 

1 

0 

f INCLUDE  CCI(DCL); 

PPPH001C 

PPPH0040 

PPPH0050 

/• 

•/ 

PR»»H0O60 

/•  CCRPUTE  UR  AND  REC.  • 

•/ 

P°  PR  007  0 

/• 

•/ 

PPPH00R0 

MMMMM.MMMMMMM.t 

PBPH0090 

PBPH0100 

4 

1 

0 

DCL  ( HU  (0  : NU)  ,RZ(0:NZ)  , B , BB)  BIT  ALIGNED,  PP  POINTER, 

PBPHO 110 

<S,I)  FLOAT  PIN,  I FIXED  BIN; 

PPPR0120 
PRPHO 1 10 

/•  STEPO  CONFUTE  UH  * POST  LIKELY  CPTIRAL  INPUT 

•/  P R PH  0 14  0 

/•  P RFC  * LIKELY  G-PECURPEN?  N ADE 

•/  PBPHO ISO 

PP  PRO  16  0 

1 

0 

P NODE,  P REC  * P FSS  NODE  1; 

PBPH0110 

6 

1 

0 

LOOPO: 

PPPH0180 

FP  UG  * P tJG; 

PPPH01QO 

7 

1 

0 

T ■ - 1 . ; 

PRPH0200 

0 

1 

0 

DC  U»1  TO  NU; 

POPH0210 

9 

1 

S»0.  ; 

P °PHO  220 

10 

1 

1 

DO  1*1  TO  N; 

PP  PH  0 2 3 0 

1 1 

1 

2 

IP  F UG  (I)  • n 

PPPH0240 

THEN  S * S ♦ RCNSUR  (I)  ; 

PBPH02SO 

12 

1 

2 

END; 

PRPH0260 

1 3 

1 

1 

IF  S>TOE-4 

PP  PH  027  0 

THEN  DO; 

PRPH02R0 

14 

1 

2 

UH  ■ 0; 

PRPH0290 

15 

1 

2 

T ■ S{ 

PRPHO 100 

16 

1 

2 

END; 

P»>PHO  310 

17 

1 

1 

END; 

PPPHO  120 
PPPHO  no 

1 R 

1 

0 

IF  REC.H  THEN  P PEC  ■ P NCOE; 

PPPHO  340 

19 

1 

0 

RFC. 3,  RFC.H  • '0*0; 

PPPHO  160 
PPPHO  If  0 

20 

21 

1 

1 

0 

0 

P RODE  • P.NBXT  ESS_NODE; 
IF  P NODE  NfTLL 

PPPHO J70 
PRPHO  3°0 

THEN  GOTO  LOCPO; 

PPPHO  JRO 
PPPH0400 

22 

1 

0 

PB  • * 1 • B ; /•  FIRST  PASS  FINDS 

RFC.G 

•/PPPH041 0 
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PL/I  CPTIHITIRG  COfIPILP P PREP_H:  PROC  REORDER; 


STHT 

LEV 

NT 

/•  STBP1  SET  P PEL  * LT KELT  FECUPREKT  PbOE  »«D  SET  PTO  = 1 

•/  PPPHO<*30 

* 

popHOUUO 

23 

1 

0 

STEP1 : 

PP PH  0450 

P NODE  * P ESS  NODE  1; 

POPH0460 

“ - - - 

P R PH  0 47  0 

24 

1 

0 

LOOP1: 

PBPH04«0 

P REC  * P NODE; 

PPPH  044  0 

25 

1 

0 

PEC. TO  * 70*B; 

PSPH0500 

26 

1 

0 

P.NOPE  « P_NEIT_ESS_NODE; 

PPPH051 0 

27 

1 

0 

IP  P_NODE--vnLL~ 

PRPH0520 

THF»“gCTC  LOOP f; 

PRPH051 0 

P P PH  0 54  0 

/•  STEP2  SE*  REC.PROR  » 0 

•/  P°PH0550 

PRPH0560 

28 

1 

0 

STIP2; 

PRPH0570 

P NODE  « P_ESS_NODE-1 ; 

P*»PH05H0 

24 

1 

0 

LCOP2 ; 

PRPH0540 

RPC. PROS  » ' 0 • B ; 

"RPH0600 

30 

1 

0 

P NOPc  * P NEIT_ESS  NODE; 

Ppt»HO*iO 

31 

1 

0 

IP  P_NODK  NULL 

P°PH062  0 

THEn“gOTO  LOOP2; 

nRPH0f  30 

PRPH064  0 

32 

1 

0 

RPT2: 

PPPH0650 

P NO Dr  • P PEC l 

PPPH0660 

33 

1 

0 

REC.TO.RPcTpRCH  « * 1 ' P ; 

PPPH0670 
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FL/I  OPTIMIZING  COMPILER  PEEP_H:  PEOC  PEOIDEI; 


/* STEP3   FILL REC.TO AND REC.FROM */

RPT3:
B = '0'B;
P_NODE = P_ESS_NODE_1;
LOOP3:
IF ((¬REC.TO|REC.FROM) & (BB|REC.G))
THEN DO;
   BZ = '0'B;
   IF BB THEN DO;
      BU = '0'B;
      FP_UG = P_UG;
      DO I=1 TO N;
         BU(F_UG(I)) = '1'B;
      END;
      DO U=1 TO NU;
         IF BU(U)
         THEN DO Y=1 TO NY;
            BZ(ZCODE(U,Y)) = '1'B;
         END;
      END;
   END;
   ELSE DO Y = 1 TO NY;
      BZ(ZCODE(UH,Y)) = '1'B;
   END;
   DO Z=1 TO NZ;
      IF BZ(Z)
      THEN DO;
         PP = P_NEXTZ(Z);
         IF PP¬=NULL
         THEN DO;
            PP = PP->P_ESS_NODE;
            IF (¬REC.TO) & PP->REC.TO
            THEN B,REC.TO = '1'B;
            IF (¬PP->REC.FROM) & REC.FROM
            THEN B,PP->REC.FROM = '1'B;
         END;
      END;
   END;
END;
P_NODE = P_NEXT_ESS_NODE;
IF P_NODE ¬= NULL
THEN GOTO LOOP3;

IF B THEN GOTO RPT3;
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PL/I  OPTIMIZI MG  COMPILER  PREP_H:  PROC  REORDER; 


STBT 

L IV 

NT 

/* STEP4  CHECK FOR CHAINS NOT CONTAINING P_REL */
    PP = NULL;
    P_NODE = P_ESS_NODE_1;
LOOP4:
    IF REC.FROM & (¬REC.TO) & (BB | REC.G)
    THEN DO;
        P_REC = P_NODE;
        GOTO STEP2;
    END;
    IF (¬REC.TO) & (¬REC.FROM) & (BB | REC.G) THEN PP = P_NODE;
    P_NODE = P_NEXT_ESS_NODE;
    IF P_NODE ¬= NULL
    THEN GOTO LOOP4;
    IF PP ¬= NULL
    THEN DO;
        P_REC = PP;
        GOTO RPT2;
    END;

/* STEP5  FILL IN REC.G/REC.H (ACCORDING TO BB) */
    P_NODE = P_ESS_NODE_1;
LOOP5:
    IF BB
    THEN REC.G = REC.TO & REC.FROM;
    ELSE REC.H = REC.G & REC.TO & REC.FROM;
    P_NODE = P_NEXT_ESS_NODE;
    IF P_NODE ¬= NULL
    THEN GOTO LOOP5;
    IF BB
    THEN DO;
        BB = '0'B;
        GOTO STEP1;
    END;
END;
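The REC.TO and REC.FROM marking in STEP2 through STEP4 of the listing above is a repeated sweep over the nodes that stops when no flag changes. The following Python sketch illustrates that idea only; it is not part of the original PL/I program. It assumes a plain successor map in place of the listing's node pointers, and the names `successors`, `seed`, and `communicating_class` are hypothetical.

```python
def communicating_class(successors, seed):
    """Return the nodes that can both reach `seed` and be reached from it.

    `successors` maps each node to an iterable of its direct successors
    (an assumed stand-in for the pointer structure used in the listing).
    """
    to_seed = {seed}    # nodes known to have a path to the seed (like REC.TO)
    from_seed = {seed}  # nodes known to be reachable from the seed (like REC.FROM)
    changed = True
    while changed:      # keep sweeping until a full pass changes nothing
        changed = False
        for node, nexts in successors.items():
            for nxt in nexts:
                # A node reaches the seed if one of its successors does.
                if node not in to_seed and nxt in to_seed:
                    to_seed.add(node)
                    changed = True
                # A successor is reachable from the seed if the node is.
                if nxt not in from_seed and node in from_seed:
                    from_seed.add(nxt)
                    changed = True
    return to_seed & from_seed


if __name__ == "__main__":
    graph = {1: [2], 2: [3], 3: [2], 4: [1]}
    print(communicating_class(graph, 2))
```

Nodes flagged in only one of the two sets play the role of the transient candidates that STEP4 inspects before choosing a new seed.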

PL/I OPTIMIZING COMPILER      SOLVE_G: PROC REORDER;

SOURCE LISTING

SOLVE_G: PROC REORDER;
%INCLUDE CD1(DCL);
    DCL (I,J) FIXED BIN, (S,SS,T,TOL) FLOAT BIN, (P,P_LHS) POINTER,
        RT LABEL(RT_G,RT_H), (B,BB) BIT ALIGNED;
    DCL (WRK(N), WRK2(N)) FLOAT BIN;     /* LHS MAX AND 2ND MAX, STEP */
    DCL STRUCT_FLAG(N) FIXED BIN;        /* D^ DT FOR PREFIX 17 */

    P_LHS = ADDR(WRK(1)); FP_FLAG = ADDR(STRUCT_FLAG(1));
    RT = RT_G;
    G.STEPS, H.STEPS = 0;
    TOL = ERR*1E-3;
    ERR = 1E10;

    P_NODE = P_ESS_NODE_1;
G_LOOP0:
    FP_W = P_W; FP_UG = P_UG;
    DO I = 1 TO N;
        DP_SKIP(I) = SIGN(F_UG(I)) - 1;
    END;
    P_NODE = P_NEXT_ESS_NODE;
    IF P_NODE ¬= NULL
    THEN GOTO G_LOOP0;

G_LOOP:
    G.HIGH = -1E10;
    G.LOW = 1E10;
    G.STEPS = G.STEPS+1;
    TOL = TOL*1.2;
    S = 0.;
    P_NODE = P_ESS_NODE_1;

G_LOOP1:  /* COMPUTE VG = MAX/U/ Q(U) + SUM/Y/ PZ VH */
    FP_UG = P_UG; FP_VG = P_VG; FP_W = P_W;
    DO I = 1 TO N;
        F_VG(I) = -1.E5;
    END;

    DO U = 1 TO NU;
        FP_QZ = ADDR(P_QZ->F_QZ((U-1)*N+1));

        BB = '0'B;
        DO I = 1 TO N;
            B = DP_SKIP(I)=0 | (F_UG(I)=U & DP_SKIP(I)>0);
            IF B
            THEN DO;
                STRUCT_FLAG(I) = 1;
                WRK(I) = F_QZ(I);
                BB = '1'B;
            END;
            ELSE STRUCT_FLAG(I) = 0;
        END;
        IF ¬BB THEN GOTO NEXT_U;

        GOTO DP_OP;
RT_G:
        DO I = 1 TO N;
            IF DP_SKIP(I)=0
            THEN DO;
                IF WRK(I) > F_VG(I)
                THEN DO;
                    WRK2(I) = F_VG(I);
                    F_VG(I) = WRK(I);
                    F_UG(I) = U;
                END;
                ELSE WRK2(I) = MAX(WRK(I), WRK2(I));
            END;
        END;
NEXT_U:
    END;

    DO I = 1 TO N;
        IF DP_SKIP(I)>0
        THEN DO;
            DP_SKIP(I) = DP_SKIP(I) - 1;
            F_VG(I) = WRK(I);
        END;
        ELSE IF DP_SKIP(I)=0
        THEN DP_SKIP(I) = MIN(100., (F_VG(I)-WRK2(I))/ERR);
        IF F_UG(I)¬=0 THEN S = (S+F_VG(I))*.5;
    END;
    P_NODE = P_NEXT_ESS_NODE;
    IF P_NODE ¬= NULL
    THEN GOTO G_LOOP1;

    P_NODE = P_ESS_NODE_1;
G_LOOP2:  /* VH = VG - S AND GET ERROR BOUNDS */
    FP_UG = P_UG; FP_VG = P_VG; FP_VH = P_VH;
    DO I = 1 TO N;
        IF F_UG(I) ¬= 0
        THEN DO;
            SS = F_VG(I) - F_VH(I);
            G.HIGH = MAX(SS,G.HIGH); G.LOW = MIN(SS,G.LOW);
            F_VH(I) = (F_VG(I)+F_VH(I)-S)*.5;
        END;
    END;
    P_NODE = P_NEXT_ESS_NODE;
    IF P_NODE ¬= NULL
    THEN GOTO G_LOOP2;

    CALL TIMING(TIME.G);
    IF TIME.G > TIME.LIMIT THEN RETURN;

    ERR = G.HIGH - G.LOW;
    IF ERR > TOL
    THEN GOTO G_LOOP;

    RETURN;


SOLVE_H: ENTRY;
    RT = RT_H;
    TOL = TOL*1E-2;

H_LOOP:
    H.HIGH = -1E10; H.LOW = 1E10;
    H.STEPS = H.STEPS+1;
    TOL = TOL*2;
    S = 0.;
    P_NODE = P_ESS_NODE_1;
H_LOOP1:
    IF ¬REC.H THEN GOTO H_OUT1;

    FP_FLAG = P_UG; FP_W, P_LHS = P_W;
    U = UH;
    FP_QZ = ADDR(P_QZ->F_QZ((U-1)*N+1));
    DO I = 1 TO N;
        IF FLAG(I) ¬= 0
        THEN F_W(I) = F_QZ(I);
    END;
    GOTO DP_OP;
RT_H:
    DO I = 1 TO N;
        IF FLAG(I) ¬= 0
        THEN S = (S+F_W(I))*.5;
    END;
H_OUT1:
    P_NODE = P_NEXT_ESS_NODE;
    IF P_NODE ¬= NULL THEN GOTO H_LOOP1;

    P_NODE = P_ESS_NODE_1;
H_LOOP2:
    IF ¬REC.H THEN GOTO H_OUT2;

    FP_UG = P_UG; FP_W = P_W; FP_VH = P_VH;
    DO I = 1 TO N;
        IF F_UG(I) ¬= 0
        THEN DO;
            SS = F_W(I) - F_VH(I);
            H.HIGH = MAX(SS,H.HIGH); H.LOW = MIN(SS,H.LOW);
            F_VH(I) = (F_W(I)+F_VH(I)-S)*.5;
        END;
    END;
H_OUT2:
    P_NODE = P_NEXT_ESS_NODE;
    IF P_NODE ¬= NULL THEN GOTO H_LOOP2;

    CALL TIMING(TIME.H);
    IF TIME.H > TIME.LIMIT THEN RETURN;
    IF H.HIGH - H.LOW > TOL
    THEN GOTO H_LOOP;

    RETURN;


DP_OP:
/* COMPUTE THE OPERATION OF DYNAMIC PROGRAMMING:          */
/*    W = Q(U) + SUM/Y/ PZ(Z=(U,Y)) * ?_PEAS(T(?,Z))      */
/* IN NODE P_NODE WITH U AS SPECIFIED AT CALL TIME        */

    DO Y = 1 TO NY;
        Z = ZCODE(U,Y);
        IF Z ¬= 0
        THEN DO;
            P = P_NEXTZ(Z);
            IF P ¬= NULL
            THEN DO;
                FP_VH = P->P_ESS_NODE->P_VH;
                P = ADDR(P_PZ->F_PZ((Z-1)*N*N+1));
                DO I = 1 TO N;
                    IF FLAG(I) ¬= 0
                    THEN DO;
                        FP_PZ = ADDR(P->F_PZ((I-1)*N+1));
                        SS = 0.;
                        DO J = 1 TO N;
                            SS = SS + F_PZ(J) * F_VH(J);
                        END;
                        P_LHS->F_W(I) = P_LHS->F_W(I) + SS;
                    END;
                END;
            END;
        END;
    END;
    GOTO RT;
END;
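The G and H loops of the listing above repeat a dynamic-programming step, track the largest and smallest one-step increments (G.HIGH, G.LOW), and stop once the two are within TOL. The Python sketch below shows that stopping rule on an ordinary fully observed, undiscounted problem. It is an illustration under simplifying assumptions, not a transcription of the PL/I code: the arguments `rewards` and `transitions`, the function name, and the use of the first state's value as the normalizing offset (the listing uses a running average S) are all hypothetical choices.

```python
def relative_value_iteration(rewards, transitions, tol=1e-6, max_steps=10_000):
    """Damped relative value iteration with high/low bounds on the average reward.

    rewards[u][i]        : expected incremental reward for input u in state i
    transitions[u][i][j] : probability of moving from state i to j under input u
    Returns (low, high, policy) once high - low <= tol.
    """
    n_inputs = len(rewards)
    n_states = len(rewards[0])
    h = [0.0] * n_states  # relative value function (plays the role of VH)
    for _ in range(max_steps):
        g = []       # one-step lookahead values (plays the role of VG)
        policy = []  # maximizing input for each state (plays the role of UG)
        for i in range(n_states):
            candidates = [
                rewards[u][i]
                + sum(transitions[u][i][j] * h[j] for j in range(n_states))
                for u in range(n_inputs)
            ]
            best = max(range(n_inputs), key=candidates.__getitem__)
            policy.append(best)
            g.append(candidates[best])
        diffs = [g[i] - h[i] for i in range(n_states)]
        low, high = min(diffs), max(diffs)
        if high - low <= tol:
            return low, high, policy
        offset = g[0]  # assumed normalization; keeps h from drifting
        # Averaging the new and old values mirrors VH = (VG + VH - S) * .5.
        h = [0.5 * (g[i] + h[i] - offset) for i in range(n_states)]
    raise RuntimeError("bounds did not close within max_steps")


if __name__ == "__main__":
    p = [[0.5, 0.5], [0.5, 0.5]]
    print(relative_value_iteration([[1.0, 3.0], [0.0, 0.0]], [p, p]))
```

Under the usual assumptions for undiscounted value iteration, the returned `low` and `high` bracket the optimal average reward, which is why their difference can serve as the termination test.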

PL/I  OPTIMIZING  COMPILER 


REPORT: PROC REORDER;

SOURCE LISTING

REPORT: PROC REORDER;
%INCLUDE CD1(DCL);
/****************************************/
/*                                      */
/*  PRINT RESULTS                       */
/*                                      */
/****************************************/
    DCL (I,J) FIXED BIN, P POINTER, C CHAR(1) ALIGNED;
    DCL SCAN EXT ENTRY;

    SIGNAL ENDPAGE(SYSPRINT);

    ERR = G.HIGH - G.LOW + 1.E-10;
    P_NODE, P_REL = P_ROOT;
    LEV, L0, L00 = 0;
    IF P_ESS_NODE ¬= NULL
    THEN GOTO PD;

LOOP:
    CALL SCAN;
    IF P_NODE ¬= NULL
    THEN GOTO PD;

    IF ERR <= MIN_ERR | N >= MAX_N | ESS_N >= MAX_ESS_N
        | TIME.G >= TIME.LIMIT
    THEN DO;
        PUT EDIT('|','|STOP') (COL(1),A,COL(80),A);
        STOP;
    END;
    RETURN;

PD:
    IF LINENO(SYSPRINT) > 55-N*PRT
    THEN SIGNAL ENDPAGE(SYSPRINT);

    PUT SKIP(2);
    IF REC.G THEN PUT EDIT('G') (COL(14),A);
    IF REC.H THEN PUT EDIT('H') (A);

    J = UH;
    PUT EDIT(J) (COL(19),F(3));

    FP_UG = P_UG;
    C = '*';
    DO I = 1 TO N;
        IF F_UG(I) ¬= 0 & F_UG(I) ¬= J
        THEN DO;
            C = ' ';
            GOTO STAR_OUT;
        END;
    END;
STAR_OUT:
    PUT EDIT(C) (A);

    IF P_NODE = P_ROOT
    THEN PUT EDIT('<E>') (COL(73),A);
    ELSE DO;
        PUT EDIT(Z_BACK) (COL(MAX(1,76-LEV*3)),F(3));
        P = P_NODE;
        DO I = LEV-2 TO L0 BY -1;
            P = P->P_BACK;
            PUT EDIT(P->Z_BACK) (F(3));
        END;
    END;

    IF PRT = 0
    THEN GOTO LOOP;

    FP_TPM = P_TPM; FP_VG = P_VG; FP_VH = P_VH;
    DO I = 1 TO N;
        IF F_UG(I) ¬= 0
        THEN DO;
            PUT EDIT(I,F_UG(I),F_VG(I)) (COL(16),2 F(3),F(6,2));
            IF REC.H THEN PUT EDIT(F_VH(I)) (F(6,2));
            PUT EDIT((F_TPM((I-1)*N+J) DO J=1 TO N)) (COL(34),5 F(6,4));
        END;
    END;
    GOTO LOOP;
END;

PL/I OPTIMIZING COMPILER

SCAN: PROC REORDER;

SOURCE LISTING


SCAN: PROC REORDER;
%INCLUDE CD1(DCL);
/* FIND NEXT ESS NODE AFTER P_NODE, IN TREE ORDER */
    DCL I FIXED BIN;
    L0 = LEV;
    N * b_N TDt :
    I = 0;

CLIMB:
    FP_BRANCHES = P_BRANCHES;
    DO Z = I+1 TO NZ;
        IF F_E_BRANCH(Z) & F_P_BRANCH(Z) ¬= NULL
        THEN GOTO NEXT_LEV;
    END;

/* ALL BRANCHES HAVE BEEN EXPLORED, GO BACK DOWN */
DOWN:
    IF LEV=L00
    THEN DO;
        P_NODE = NULL;
        RETURN;
    END;
    LEV, L0 = LEV-1;
    I = Z_BACK;
    P_NODE = P_BACK;
    P_REL = P_REL->P_BACK;
    GOTO CLIMB;

/* CLIMB BRANCH Z */
NEXT_LEV:
    LEV = LEV+1;
    P_NODE = F_P_BRANCH(Z);
    P_REL = P_REL->P_BRANCHES->F_P_BRANCH(Z);
    FP_BRANCHES = P_BRANCHES;
    DO Z = 1 TO NZ;
        IF F_E_BRANCH(Z) & F_P_BRANCH(Z)=NULL
        THEN RETURN;
    END;
    GOTO NEW_NODE;
END;
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D-sense  spread  of  normalized  range,  100 

Detectability  index,  49,53,115 

Finite-horizon  weights,  28 

Connected  class  of  states,  81 

Detectable  classification  of  states  at 
time  k,  118 

Metric on Π_N, 87, 94-96

Unit  vector,  11 
Empty word, 62

Essential  part  of  memory  set,  70 

Expectation  under  strategy  y,  26 

Performance  indices,  28 

Perceptive  gain,  145 

Pseudo-perceptive  gain,  147 

Possible  states  (preceding,  following) 
evolution  of  z,  63 

Time,  21-22 

Horizon,  22 

Length  of  word  z^,  62 

(Reachability,  detectability)  time 
constant,  48-49,  (82,115) 

Discounted  time  interval,  135 

Memory  set , 66 

Value-iteration  step,  128,136 
Number  of  states,  21 
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Iteration  number,  37-38,145 

Transition  probabilities  of  augmented 
system,  76-77 

Transition  probability  matrix,  21 

Probability  under  strategy  y,  25-26 

Expected  incremental  reward  at  time  k,  28 

Expected  incremental  reward  vector,  29 

Expected  incremental  rewards  for  augmented 
system,  76-77 

Bounds  on  expected  incremental  rewards,  29 
Reward  at  time  k. 

Row  of  a matrix,  11 
N-dimensional  Euclidean  space,  11 
State  at  time  k,  21 
State  set,  21 

Information  vector  transition  function, 
26,64 

Memory  state  transition  function,  68 
Input  at  time  k,  21 
Input  set,  21 

Finite-horizon  value  function,  125 

Infinite-horizon  relative  value  function, 
134 

Banach space of continuous bounded real-valued functions on Π_N, 96
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Augmented  state  at  time  k,  75 
Augmented  state  set,  75 
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Connected  class  of  augmented  states,  83 

Output  at  time  k,  21 

Output  set,  21 

Memory  state  at  time  k,  66 

Set  of  input-output  pairs,  62 

Set  of  input-output  words,  64 

Positive  part,  11 

(Integers,  reals)  between  a and  b,  11 

Subtraction of rightmost part of word, 65

Bayes'  operator,  51,82 

Sum  of  vector  components,  12 

Sup  norm,  96 

Variation  of  convex  function,  97-98 
A-sense  contraction,  100 
Detectability  index,  53,106,109,112,114 
Discount,  28 

Decision  strategy,  24,78 

Hajnal  measure,  87,93 

Metric  on  11^,  87,89 

Information  vector  at  time  k,  26 

Number  of  detectable  classes,  118 

Initial  state  probability  vector,  21 
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vectors,  11 
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Reachability  index,  48,82 

Policy compatibility flag, 114

Elasticity  of  memory  effectiveness,  16-17, 
106,109,112,114 

Feasible  strategy  and  the  policy  that 
realizes  it,  78 

Pseudo-perceptive  strategy  derived  from 
ipM  , 146-147 

Optimal  feasible  strategy,  134 

Set  of  feasible  strategies  adapted  to  M,  78 

Connectivity index, 81-82

Perceptive  strategy  and  the  policy  that 
realizes  it,  79 

Optimal  perceptive  strategy  adapted  to  M, 
146 

Set  of  perceptive  strategies  adapted  to  M, 
79 

Value  of  information,  50,135 
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GLOSSARY 


accept: The action in which a system receives an input, 21.

augmentation: Transformation of an FPS to one having augmented states,
58,76.

augmented state: Transformed state consisting of a delayed internal
state and a memory state, 57,75.

concatenation: Two or more words (strings) placed end to end so as to
form a single word, 62.

connectivity: A relation between states i and j indicating that the
system in state i may eventually enter state j provided that suitable
inputs are selected in the interim, 81.

controller: A dynamical realization of the decision strategy, 24.

control problem: The problem of designing a controller which realizes
an optimal or ε-optimal strategy, 31.

decision strategy: A (possibly probabilistic) rule for the selection
of plant inputs, 24.

detectability: A condition under which the information vector is
increasingly insensitive to increasingly delayed information, 105-106,53.

emit: The action in which an output is generated by the system, 21.

essential memory state: A memory state that is recurrent under some
policy, 70.

estimation problem: The problem of recursively computing an estimator
or sufficient statistic. In the case of an FPS, the estimator is the
information vector, 30.

feasible: A strategy is feasible if it can be realized on the basis of
available information; otherwise it is perceptive, 78.

finite-memory constraint: The constraint that a decision strategy be
realizable by a finite-state automaton, 24.

finite probabilistic system: A discrete-time, finite-input,
finite-output, finite-state stationary controlled stochastic process,
13,20-22.

FPS: See "finite probabilistic system."

free FPS: An FPS whose input set contains but one element, i.e. an
FPS whose input process may be ignored.

free system induced by a decision strategy: The system which results
when a plant and its controller are considered as a single unit, 23-24.

horizon: Length of the time set, 22.

infinitely delayed splurge: Phenomenon arising in the absence of
detectability, 48,142.

information vector: A vector, which may be computed by an observer,
whose i-th entry is the a posteriori probability that the system is in
state i, 26.

information vector transition function: The rule by which an observer
updates the information vector, 26.

memory set: A vocabulary of input-output words available to the
observer, 57,65-66.

memory state: The word of most recent input-output pairs retained by
the observer, 57,65-66.

memory state transition function: The rule by which an observer updates
the memory state, 68.

memory tree: A graphical representation of the memory set, 66-68.

observer: A system which accepts plant outputs and computes the
information vector (or an approximation thereof), 30.

perception: An output which has been artificially added to the plant to
facilitate computation, 35,54.

plant: The system to be observed or controlled, 13,18.

policy: A finite array which specifies the decision strategy, 14,78-79.

pseudo-perception: An approximation to a perception, obtained by
guessing the value of the perception on the basis of the memory state,
54,146.

reachability: A condition under which the state of an FPS can be made
to assume a desired value with probability bounded from below, for any
initial state probability vector, 48,82.

realization: Specification of system components which will act
according to a given rule, e.g. a controller realizes a decision
strategy, 14,24.

representation: Specification according to a particular system of
notation, 20,22.

reward: The component of a performance index which depends on a
particular input-output pair as well as the states preceding and
following it, 27; the expected incremental reward depends only on an
input and the state preceding it, 28-29.

state-calculability: A possible FPS property, given by (2.3), 23.

state-observability: A possible FPS property, given by (2.4), 23.

statistical decision problem: A control problem in which plant dynamics
are unaffected by input values, 30.

strategy: See "decision strategy."

subrectangularity: A property of substochastic matrices given by
(13.1), 99; also, a possible property of FPS's given by (14.1) and
(14.7), 105,106,109.

SDT: Strong detectability.

SSR: Strong subrectangularity.

valued finite probabilistic system: An FPS, along with a process of
incremental rewards or expected incremental rewards, making possible
the definition of performance indices as a function of strategy, 28.

VFPS: See "valued finite probabilistic system."

WDT: Weak detectability.

WSR: Weak subrectangularity.

Notions of reachability and detectability in FPS's (similar to
controllability and observability in linear systems) are made precise.
It is shown that every FPS can be reduced to components that are either
reachable and detectable, or transient, or null-recurrent.

It is well known that the information vector (whose i-th entry is
the a posteriori probability that the system is in state i) is a
sufficient statistic (for the estimation of future dynamics given past
inputs and outputs). A contraction property of the information vector
transition function is exploited to obtain procedures for ε-optimal
(arbitrarily close) approximation of the information vector by a
deterministic time-invariant finite-memory observer. Each observer
state corresponds to a particular configuration of most recent
input-output pairs. The average error achieved by such an approximation
is bounded by the expression $(m/m_0)^{-T}$, where $m_0$ and $T$ are
parameters associated with the observed system, and $m$ is the number
of observer states.
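The information vector mentioned above is updated by a Bayes step each time an input is applied and an output is observed. A minimal Python sketch of one such step follows. It assumes the system is described, for the chosen input and the observed output, by a substochastic matrix `joint` whose entry `joint[i][j]` is the probability of moving from state i to state j and emitting that output; this parametrization and the names `pi`, `joint`, and `update_information_vector` are illustrative assumptions, not notation taken from the report.

```python
def update_information_vector(pi, joint):
    """One Bayes update of the a posteriori state distribution.

    pi[i]       : current probability that the system is in state i
    joint[i][j] : Pr(next state = j and the observed output | state = i, input)
    Returns the normalized a posteriori distribution over next states.
    """
    n_from = len(pi)
    n_to = len(joint[0])
    # Unnormalized weight of each next state given the observation.
    weights = [sum(pi[i] * joint[i][j] for i in range(n_from)) for j in range(n_to)]
    total = sum(weights)  # probability of seeing this output under pi
    if total == 0.0:
        raise ValueError("observed output has zero probability under pi")
    return [w / total for w in weights]


if __name__ == "__main__":
    print(update_information_vector([0.5, 0.5], [[0.4, 0.1], [0.0, 0.2]]))
```

A finite-memory observer of the kind described here would store one such vector (or an approximation to it) per memory state instead of recomputing it from the full input-output history.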

Control problems, in which the average reward is maximized over a
discounted or undiscounted infinite horizon, may be solved by an
iterative procedure which has been given the name perceptive dynamic
programming. Successively weaker assumptions that the controller
"perceives" unavailable state values transform the problem into a
sequence of formulations which may be solved by dynamic programming.
Each solution obtained in this manner is used to construct a feasible
controller formulation, taking the form of a deterministic
time-invariant finite-state automaton. Monotone geometrically
convergent bounds, containing both the supremum feasible performance
and that of the current design, are also obtained. Computation may be
terminated when these bounds become sufficiently close, or when the
number of controller states becomes excessively large. Although
computing a solution by perceptive dynamic programming may require
considerable time and storage, both are roughly proportional to the
number of controller states allowed in the final iteration; thus the
cost of controller design reflects the cost of controller
implementation.

This procedure was applied to idealized problems of machine maintenance
and computer communication, both of which had been investigated by other
researchers. The first problem was solved exactly; a design suitably
close to the optimum was obtained for the second problem.
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